Posted to commits@helix.apache.org by zz...@apache.org on 2013/09/03 07:08:43 UTC

git commit: [HELIX-215] YAML-based configuration, new recipe that uses YAML and USER_DEFINED rebalancer, rb=13930

Updated Branches:
  refs/heads/master a019f3b26 -> c57426501


[HELIX-215] YAML-based configuration, new recipe that uses YAML and USER_DEFINED rebalancer, rb=13930


Project: http://git-wip-us.apache.org/repos/asf/incubator-helix/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-helix/commit/c5742650
Tree: http://git-wip-us.apache.org/repos/asf/incubator-helix/tree/c5742650
Diff: http://git-wip-us.apache.org/repos/asf/incubator-helix/diff/c5742650

Branch: refs/heads/master
Commit: c57426501773f4dd36080b934a4e75fd50bc54d3
Parents: a019f3b
Author: zzhang <zz...@uci.edu>
Authored: Mon Sep 2 22:08:34 2013 -0700
Committer: zzhang <zz...@uci.edu>
Committed: Mon Sep 2 22:08:34 2013 -0700

----------------------------------------------------------------------
 src/site/markdown/Architecture.md               |  91 +++---
 src/site/markdown/Concepts.md                   |  26 +-
 src/site/markdown/Quickstart.md                 |  10 +-
 src/site/markdown/Tutorial.md                   |  22 +-
 src/site/markdown/index.md                      |  25 +-
 src/site/markdown/recipes/lock_manager.md       |   2 +-
 .../markdown/recipes/rabbitmq_consumer_group.md |   4 +-
 .../markdown/recipes/user_def_rebalancer.md     | 287 +++++++++++++++++++
 src/site/markdown/tutorial_controller.md        |   2 +-
 src/site/markdown/tutorial_messaging.md         |   4 +-
 src/site/markdown/tutorial_participant.md       |   5 +-
 src/site/markdown/tutorial_rebalance.md         |  51 ++--
 src/site/markdown/tutorial_spectator.md         |   4 +-
 .../markdown/tutorial_user_def_rebalancer.md    | 196 +++++++++++++
 src/site/markdown/tutorial_yaml.md              |  98 +++++++
 src/site/site.xml                               |   1 +
 16 files changed, 712 insertions(+), 116 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c5742650/src/site/markdown/Architecture.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/Architecture.md b/src/site/markdown/Architecture.md
index 7acf590..ac96443 100644
--- a/src/site/markdown/Architecture.md
+++ b/src/site/markdown/Architecture.md
@@ -29,16 +29,16 @@ Helix aims to provide the following abilities to a distributed system:
 * Monitor cluster health and provide alerts on SLA violation.
 * Service discovery mechanism to route requests.
 
-To build such a system, we need a mechanism to co-ordinate between different nodes/components in the system. This mechanism can be achieved with a software that reacts to any change in the cluster and comes up with a set of tasks needed to bring the cluster to a stable state. The set of tasks will be assigned to one or more nodes in the cluster. Helix serves this purpose of managing the various components in the cluster.
+To build such a system, we need a mechanism to co-ordinate between different nodes and other components in the system. This mechanism can be achieved with software that reacts to any change in the cluster and comes up with a set of tasks needed to bring the cluster to a stable state. The set of tasks will be assigned to one or more nodes in the cluster. Helix serves this purpose of managing the various components in the cluster.
 
 ![Helix Design](images/system.png)
 
 Distributed System Components
 
-In general any distributed system cluster will have the following
+In general, any distributed system cluster will have the following components and properties:
 
-* Set of nodes also referred to as an instance.
-* Set of resources which can be a database, lucene index or a task.
+* A set of nodes, also referred to as instances.
+* A set of resources, which can be databases, Lucene indexes, or tasks.
 * Each resource is also partitioned into one or more partitions.
 * Each partition may have one or more copies called replicas.
 * Each replica can have a state associated with it. For example: Master, Slave, Leader, Standby, Online, Offline, etc.
@@ -48,47 +48,46 @@ Roles
 
 ![Helix Design](images/HELIX-components.png)
 
-Not all nodes in a distributed system will perform similar functionality. For e.g, a few nodes might be serving requests, few nodes might be sending the request and some nodes might be controlling the nodes in the cluster. Based on functionality we have grouped them into
+Not all nodes in a distributed system perform the same functions. For example, a few nodes might be serving requests, a few might be sending requests, and some might be controlling the other nodes in the cluster. Thus, Helix categorizes nodes by their specific roles in the system.
 
-We have divided Helix in 3 logical components based on their responsibility 
-
-1. PARTICIPANT: The nodes that actually host the distributed resources.
-2. SPECTATOR: The nodes that simply observe the PARTICIPANT State and route the request accordingly. Routers, for example, need to know the Instance on which a partition is hosted and its state in order to route the request to the appropriate end point.
-3. CONTROLLER: The controller observes and controls the PARTICIPANT nodes. It is responsible for coordinating all transitions in the cluster and ensuring that state constraints are satisfied and cluster stability is maintained. 
+We have divided Helix nodes into 3 logical components based on their responsibilities:
 
+1. Participant: The nodes that actually host the distributed resources.
+2. Spectator: The nodes that simply observe the Participant state and route the request accordingly. Routers, for example, need to know the instance on which a partition is hosted and its state in order to route the request to the appropriate end point.
+3. Controller: The controller observes and controls the Participant nodes. It is responsible for coordinating all transitions in the cluster and ensuring that state constraints are satisfied and cluster stability is maintained. 
 
 These are simply logical components and can be deployed as per the system requirements. For example:
 
-1. Controller can be deployed as a separate service
-2. Controller can be deployed along with a Participant but only one Controller will be active at any given time.
+1. The controller can be deployed as a separate service.
+2. The controller can be deployed along with a Participant, but only one Controller will be active at any given time.
 
 Both have pros and cons, which will be discussed later, and one can choose the mode of deployment as per system needs.
 
 
-## Cluster state/metadata store
+## Cluster state metadata store
 
 We need a distributed store to maintain the state of the cluster and a notification system to notify if there is any change in the cluster state. Helix uses Zookeeper to achieve this functionality.
 
 Zookeeper provides:
 
 * A way to represent PERSISTENT state, which remains until it is deleted.
-* A way to represent TRANSIENT/EPHEMERAL state which vanishes when the process that created the STATE dies.
-* Notification mechanism when there is a change in PERSISTENT/EPHEMERAL STATE
+* A way to represent TRANSIENT/EPHEMERAL state, which vanishes when the process that created the state dies.
+* A notification mechanism for changes in PERSISTENT and EPHEMERAL state
 
-The namespace provided by ZooKeeper is much like that of a standard file system. A name is a sequence of path elements separated by a slash (/). Every node[ZNODE] in ZooKeeper's namespace is identified by a path.
+The namespace provided by ZooKeeper is much like that of a standard file system. A name is a sequence of path elements separated by a slash (/). Every node (ZNode) in ZooKeeper's namespace is identified by a path.
 
-More info on Zookeeper can be found here http://zookeeper.apache.org
+More info on Zookeeper can be found at http://zookeeper.apache.org
 
-## Statemachine and constraints
+## State machine and constraints
 
-Even though the concept of Resource, Partition, Replicas is common to most distributed systems, one thing that differentiates one distributed system from another is the way each partition is assigned a state and the constraints on each state.
+Even though the concepts of Resources, Partitions, and Replicas are common to most distributed systems, one thing that differentiates one distributed system from another is the way each partition is assigned a state and the constraints on each state.
 
 For example:
 
-1. If a system is serving READ ONLY data then all partition's replicas are equal and they can either be ONLINE or OFFLINE.
-2. If a system takes BOTH READ and WRITES but ensure that WRITES go through only one partition then the states will be MASTER, SLAVE and OFFLINE. Writes go through the MASTER and is replicated to the SLAVES. Optionally, READS can go through SLAVES.
+1. If a system is serving read-only data, then all of a partition's replicas are equal and they can be either ONLINE or OFFLINE.
+2. If a system takes _both_ reads and writes but ensures that writes go through only one partition, the states will be MASTER, SLAVE, and OFFLINE. Writes go through the MASTER and replicate to the SLAVEs. Optionally, reads can go through the SLAVEs.
 
-Apart from defining STATE for each partition, the transition path to each STATE can be application specific. For example, in order to become MASTER it might be a requirement to first become a SLAVE. This ensures that if the SLAVE does not have the data as part of OFFLINE-SLAVE transition it can bootstrap data from other nodes in the system.
+Apart from defining state for each partition, the transition path to each state can be application specific. For example, in order to become a MASTER, it might be a requirement to first become a SLAVE. This ensures that if the SLAVE does not have the data, it can bootstrap it from other nodes in the system as part of the OFFLINE-SLAVE transition.
 
 Helix provides a way to configure an application-specific state machine along with constraints on each state. In addition to constraints on states, Helix also provides a way to specify constraints on transitions.  (More on this later.)
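 
 As a rough, generic illustration (independent of the Helix API), an application-defined transition path can be modeled as a next-hop lookup: given a replica's current state and its target state, it yields the state to transition to next. A minimal sketch, assuming a MasterSlave-style model:
 
 ```
 // illustrative only; assumes java.util.Map and java.util.HashMap
 Map<String, Map<String, String>> nextHop = new HashMap<String, Map<String, String>>();
 nextHop.put("OFFLINE", new HashMap<String, String>());
 nextHop.get("OFFLINE").put("SLAVE", "SLAVE");
 nextHop.get("OFFLINE").put("MASTER", "SLAVE"); // must become SLAVE before MASTER
 nextHop.put("SLAVE", new HashMap<String, String>());
 nextHop.get("SLAVE").put("MASTER", "MASTER");
 nextHop.get("SLAVE").put("OFFLINE", "OFFLINE");
 nextHop.put("MASTER", new HashMap<String, String>());
 nextHop.get("MASTER").put("SLAVE", "SLAVE");
 nextHop.get("MASTER").put("OFFLINE", "SLAVE"); // step down to SLAVE before going OFFLINE
 
 String next = nextHop.get("OFFLINE").get("MASTER"); // "SLAVE"
 ```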
 
@@ -113,17 +112,17 @@ MASTER  | SLAVE    | SLAVE  |   N/A   |
 
 The following terms are used in Helix to model a state machine.
 
-* IDEALSTATE: The state in which we need the cluster to be in if all nodes are up and running. In other words, all state constraints are satisfied.
-* CURRENTSTATE: Represents the current state of each node in the cluster 
-* EXTERNALVIEW: Represents the combined view of CURRENTSTATE of all nodes.  
+* IdealState: The state we want the cluster to be in if all nodes are up and running. In other words, all state constraints are satisfied.
+* CurrentState: Represents the actual current state of each node in the cluster 
+* ExternalView: Represents the combined view of CurrentState of all nodes.  
 
-The goal of Helix is always to make the CURRENTSTATE of the system same as the IDEALSTATE. Some scenarios where this may not be true are:
+The goal of Helix is always to make the CurrentState of the system the same as the IdealState. Some scenarios where this may not be true are:
 
 * When all nodes are down
 * When one or more nodes fail
 * When new nodes are added and the partitions need to be reassigned
 
-### IDEALSTATE
+### IdealState
 
 Helix lets the application define the IdealState on a resource basis which basically consists of:
 
@@ -140,11 +139,11 @@ Example:
 * .....
 * Partition-p, replica-3, Slave, Node-n
 
-Helix comes with various algorithms to automatically assign the partitions to nodes. The default algorithm minimizes the number of shuffles that happen when new nodes are added to the system
+Helix comes with various algorithms to automatically assign the partitions to nodes. The default algorithm minimizes the number of shuffles that happen when new nodes are added to the system.
 
-### CURRENTSTATE
+### CurrentState
 
-Every instance in the cluster hosts one or more partitions of a resource. Each of the partitions has a State associated with it.
+Every instance in the cluster hosts one or more partitions of a resource. Each of the partitions has a state associated with it.
 
 Example Node-1
 
@@ -154,9 +153,9 @@ Example Node-1
 * ....
 * Partition-p, Slave
 
-### EXTERNALVIEW
+### ExternalView
 
-External clients needs to know the state of each partition in the cluster and the Node hosting that partition. Helix provides one view of the system to SPECTATORS as EXTERNAL VIEW. EXTERNAL VIEW is simply an aggregate of all CURRENTSTATE
+External clients need to know the state of each partition in the cluster and the node hosting that partition. Helix provides one view of the system to Spectators as _ExternalView_. ExternalView is simply an aggregate of all node CurrentStates.
 
 * Partition-1, replica-1, Master, Node-1
 * Partition-1, replica-2, Slave, Node-2
@@ -171,28 +170,28 @@ Mode of operation in a cluster
 
 A node process can be one of the following:
 
-* PARTICIPANT: The process registers itself in the cluster and acts on the messages received in its queue and updates the current state.  Example: Storage Node
-* SPECTATOR: The process is simply interested in the changes in the Externalview. The Router is a spectator of the Storage cluster.
-* CONTROLLER: This process actively controls the cluster by reacting to changes in Cluster State and sending messages to PARTICIPANTS.
+* Participant: The process registers itself in the cluster, acts on the messages received in its queue, and updates the current state.  Example: a storage node in a distributed database
+* Spectator: The process is simply interested in the changes in the ExternalView.
+* Controller: This process actively controls the cluster by reacting to changes in cluster state and sending messages to Participants.
 
 
 ### Participant Node Process
 
-* When Node starts up, it registers itself under LIVEINSTANCES
-* After registering, it waits for new Messages in the message queue
+* When a node starts up, it registers itself under _LiveInstances_
+* After registering, it waits for new _Messages_ in the message queue
 * When it receives a message, it will perform the required task as indicated in the message
-* After the task is completed, depending on the task outcome it updates the CURRENTSTATE
+* After the task is completed, depending on the task outcome, it updates the CurrentState
 
 ### Controller Process
 
-* Watches IDEALSTATE
-* Node goes down/comes up or Node is added/removed. Watches LIVEINSTANCES and CURRENTSTATE of each Node in the cluster
-* Triggers appropriate state transition by sending message to PARTICIPANT
+* Watches IdealState
+* Notified when a node goes down/comes up or a node is added/removed. Watches the LiveInstances and CurrentState of each node in the cluster
+* Triggers appropriate state transitions by sending messages to Participants
 
 ### Spectator Process
 
-* When the process starts, it asks cluster manager agent to be notified of changes in ExternalView
-* Whenever it receives a notification, it reads the Externalview and performs required duties. For the Router, it updates its routing table.
+* When the process starts, it asks the Helix agent to be notified of changes in ExternalView
+* Whenever it receives a notification, it reads the ExternalView and performs the required duties, as in the sketch below.
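+
+A minimal spectator registration might look like the following sketch (listener and class names as in the Helix Java API; what is done on each change is application specific):
+
+```
+HelixManager spectator =
+    HelixManagerFactory.getZKHelixManager(clusterName, "router", InstanceType.SPECTATOR, zkAddress);
+spectator.connect();
+spectator.addExternalViewChangeListener(new ExternalViewChangeListener() {
+  @Override
+  public void onExternalViewChange(List<ExternalView> externalViewList, NotificationContext context) {
+    // application-specific duty, e.g. rebuild a routing table from the aggregated views
+  }
+});
+```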
 
 #### Interaction between controller, participant and spectator
 
@@ -212,11 +211,11 @@ The following picture shows how controllers, participants and spectators interac
 * If a task is dependent on another task being completed, do not add that task
 * After any task is completed by a Participant, the Controller gets notified of the change and the state transition algorithm is re-run until the CurrentState is the same as the IdealState.
 
-## Helix znode layout
+## Helix ZNode layout
 
 Helix organizes znodes under clusterName in multiple levels. 
 
-The top level (under clusterName) znodes are all Helix defined and in upper case:
+The top level (under the cluster name) ZNodes are all Helix-defined and in upper case:
 
 * PROPERTYSTORE: application property store
 * STATEMODELDEFS: state model definitions
@@ -227,7 +226,7 @@ The top level (under clusterName) znodes are all Helix defined and in upper case
 * LIVEINSTANCES: live instances
 * CONTROLLER: cluster controller runtime information
 
-Under INSTANCES, there are runtime znodes for each instance. An instance organizes znodes as follows:
+Under INSTANCES, there are runtime ZNodes for each instance. An instance organizes ZNodes as follows:
 
 * CURRENTSTATES
     * sessionId

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c5742650/src/site/markdown/Concepts.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/Concepts.md b/src/site/markdown/Concepts.md
index 02d7406..e6bcca0 100644
--- a/src/site/markdown/Concepts.md
+++ b/src/site/markdown/Concepts.md
@@ -48,7 +48,7 @@ Consider a simple case where you want to launch a task \'myTask\' on node \'N1\'
 ```
 ### Partition
 
-If this task get too big to fit on one box, you might want to divide it into subTasks. Each subTask is referred to as a _partition_ in Helix. Let's say you want to divide the task into 3 subTasks/partitions, the IdealState can be changed as shown below.
+If this task gets too big to fit on one box, you might want to divide it into subtasks. Each subtask is referred to as a _partition_ in Helix. Let's say you want to divide the task into 3 subtasks/partitions; the IdealState can be changed as shown below.
 
 'myTask_0', 'myTask_1', 'myTask_2' are logical names representing the partitions of myTask. Each task runs on N1, N2, and N3, respectively.
 
@@ -74,7 +74,7 @@ If this task get too big to fit on one box, you might want to divide it into sub
 
 ### Replica
 
-Partitioning allows one to split the data/task into multiple subparts. But let's say the request rate each partition increases. The common solution is to have multiple copies for each partition. Helix refers to the copy of a partition as a _replica_.  Adding a replica also increases the availability of the system during failures. One can see this methodology employed often in Search systems. The index is divided into shards, and each shard has multiple copies.
+Partitioning allows one to split the data/task into multiple subparts. But let's say the request rate for each partition increases. The common solution is to have multiple copies for each partition. Helix refers to the copy of a partition as a _replica_.  Adding a replica also increases the availability of the system during failures. One can see this methodology employed often in search systems. The index is divided into shards, and each shard has multiple copies.
 
 Let's say you want to add one additional replica for each task. The IdealState can simply be changed as shown below.
 
@@ -106,7 +106,7 @@ For increasing the availability of the system, it\'s better to place the replica
 
 ### State 
 
-Now let's take a slightly complicated scenario where a task represents a database.  Unlike an index which is in general read-only, a database supports both reads and writes. Keeping the data consistent among the replicas is crucial in distributed data stores. One commonly applied technique is to assign one replica as MASTER and remaining replicas as SLAVE. All writes go to the MASTER and are then replicated to the SLAVE replicas.
+Now let's take a slightly more complicated scenario where a task represents a database.  Unlike an index, which is in general read-only, a database supports both reads and writes. Keeping the data consistent among the replicas is crucial in distributed data stores. One commonly applied technique is to assign one replica as the MASTER and the remaining replicas as SLAVEs. All writes go to the MASTER and are then replicated to the SLAVE replicas.
 
 Helix allows one to assign different states to each replica. Let's say you have two MySQL instances N1 and N2, where one will serve as MASTER and the other as SLAVE. The IdealState can be changed to:
 
@@ -130,14 +130,14 @@ Helix allows one to assign different states to each replica. Let\'s say you have
 
 ### State Machine and Transitions
 
-IdealState allows one to exactly specify the desired state of the cluster. Given an IdealState, Helix takes up the responsibility of ensuring that the cluster reaches the IdealState.  The Helix _controller_ reads the IdealState and then commands the Participant to take appropriate actions to move from one state to another until it matches the IdealState.  These actions are referred to as _transitions_ in Helix.
+IdealState allows one to exactly specify the desired state of the cluster. Given an IdealState, Helix takes up the responsibility of ensuring that the cluster reaches the IdealState.  The Helix _controller_ reads the IdealState and then commands each Participant to take appropriate actions to move from one state to another until it matches the IdealState.  These actions are referred to as _transitions_ in Helix.
 
 The next logical question is:  how does the _controller_ compute the transitions required to get to IdealState?  This is where the finite state machine concept comes in. Helix allows applications to plug in a finite state machine.  A state machine consists of the following:
 
 * State: Describes the role of a replica
-* Transition: An action that allows a replica to move from one State to another, thus changing its role.
+* Transition: An action that allows a replica to move from one state to another, thus changing its role.
 
-Here is an example of MASTERSLAVE state machine,
+Here is an example of a MasterSlave state machine:
 
 ```
           OFFLINE  | SLAVE  |  MASTER  
@@ -176,7 +176,7 @@ Helix allows each resource to be associated with one state machine. This means y
 
 ### Current State
 
-CurrentState of a resource simply represents its actual state at a PARTICIPANT. In the below example:
+CurrentState of a resource simply represents its actual state at a Participant. In the below example:
 
 * INSTANCE_NAME: Unique name representing the process
 * SESSION_ID: ID that is automatically assigned every time a process joins the cluster
@@ -206,7 +206,7 @@ Each node in the cluster has its own CurrentState.
 
 ### External View
 
-In order to communicate with the PARTICIPANTs, external clients need to know the current state of each of the PARTICIPANTs. The external clients are referred to as SPECTATORS. In order to make the life of SPECTATOR simple, Helix provides an EXTERNALVIEW that is an aggregated view of the current state across all nodes. The EXTERNALVIEW has a similar format as IDEALSTATE.
+In order to communicate with the Participants, external clients need to know the current state of each of the Participants. The external clients are referred to as Spectators. In order to make the life of a Spectator simple, Helix provides an ExternalView that is an aggregated view of the current state across all nodes. The ExternalView has a format similar to the IdealState.
 
 ```
 {
@@ -233,27 +233,27 @@ In order to communicate with the PARTICIPANTs, external clients need to know the
 
 ### Rebalancer
 
-The core component of Helix is the CONTROLLER which runs the REBALANCER algorithm on every cluster event. Cluster events can be one of the following:
+The core component of Helix is the Controller which runs the Rebalancer algorithm on every cluster event. Cluster events can be one of the following:
 
 * Nodes start/stop and soft/hard failures
 * New nodes are added/removed
 * Ideal state changes
 
-There are few more such as config changes, etc.  The key takeaway: there are many ways to trigger the rebalancer.
+There are a few more, such as configuration changes.  The key takeaway: there are many ways to trigger the rebalancer.
 
 When a rebalancer is run it simply does the following:
 
 * Compares the IdealState and current state
 * Computes the transitions required to reach the IdealState
-* Issues the transitions to each PARTICIPANT
+* Issues the transitions to each Participant
 
-The above steps happen for every change in the system. Once the current state matches the IdealState, the system is considered stable which implies 'IDEALSTATE = CURRENTSTATE = EXTERNALVIEW'
+The above steps happen for every change in the system. Once the current state matches the IdealState, the system is considered stable, which implies 'IdealState = CurrentState = ExternalView'
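+
+As a rough, standalone sketch (not the actual Helix controller code), the comparison step can be pictured as follows, assuming simple partition-to-participant-to-state maps named idealState and currentState:
+
+```
+// illustrative only; assumes java.util collections
+List<String> transitions = new ArrayList<String>();
+for (String partition : idealState.keySet()) {
+  Map<String, String> desired = idealState.get(partition);
+  Map<String, String> actual =
+      currentState.containsKey(partition) ? currentState.get(partition) : new HashMap<String, String>();
+  for (String participant : desired.keySet()) {
+    String want = desired.get(participant);
+    String have = actual.containsKey(participant) ? actual.get(participant) : "OFFLINE";
+    if (!want.equals(have)) {
+      // a real controller also enforces state-count and transition constraints here
+      transitions.add(partition + ": " + participant + " " + have + " -> " + want);
+    }
+  }
+}
+// each transition is then sent as a message to the owning Participant
+```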
 
 ### Dynamic IdealState
 
 One of the things that makes Helix powerful is that IdealState can be changed dynamically. This means one can listen to cluster events like node failures and dynamically change the ideal state. Helix will then take care of triggering the respective transitions in the system.
 
-Helix comes with a few algorithms to automatically compute the IdealState based on the constraints. For example, if you have a resource of 3 partitions and 2 replicas, Helix can automatically compute the IdealState based on the nodes that are currently active. See the [tutorial](./tutorial_rebalance.html) to find out more about various execution modes of Helix like AUTO_REBALANCE, AUTO and CUSTOM. 
+Helix comes with a few algorithms to automatically compute the IdealState based on the constraints. For example, if you have a resource of 3 partitions and 2 replicas, Helix can automatically compute the IdealState based on the nodes that are currently active. See the [tutorial](./tutorial_rebalance.html) to find out more about various execution modes of Helix like FULL_AUTO, SEMI_AUTO and CUSTOMIZED. 
 
 
 

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c5742650/src/site/markdown/Quickstart.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/Quickstart.md b/src/site/markdown/Quickstart.md
index dcffc1b..574f98b 100644
--- a/src/site/markdown/Quickstart.md
+++ b/src/site/markdown/Quickstart.md
@@ -138,7 +138,7 @@ Now you can run the same steps by hand.  In the detailed version, we\'ll do the
 * Expand the cluster: add a few nodes and rebalance the partitions
 * Failover: stop a node and verify the mastership transfer
 
-### Install/Start zookeeper
+### Install and Start Zookeeper
 
 Zookeeper can be started in standalone mode or replicated mode.
 
@@ -322,7 +322,7 @@ IdealState for myDB:
     "myDB_5" : [ "localhost_12914", "localhost_12915", "localhost_12913" ]
   },
   "simpleFields" : {
-    "IDEAL_STATE_MODE" : "AUTO",
+    "REBALANCE_MODE" : "SEMI_AUTO",
     "NUM_PARTITIONS" : "6",
     "REPLICAS" : "3",
     "STATE_MODEL_DEF_REF" : "MasterSlave",
@@ -450,7 +450,7 @@ IdealState for myDB:
     "myDB_5" : [ "localhost_12914", "localhost_12915", "localhost_12913" ]
   },
   "simpleFields" : {
-    "IDEAL_STATE_MODE" : "AUTO",
+    "REBALANCE_MODE" : "SEMI_AUTO",
     "NUM_PARTITIONS" : "6",
     "REPLICAS" : "3",
     "STATE_MODEL_DEF_REF" : "MasterSlave",
@@ -559,7 +559,7 @@ IdealState for myDB:
     "myDB_5" : [ "localhost_12914", "localhost_12915", "localhost_12913" ]
   },
   "simpleFields" : {
-    "IDEAL_STATE_MODE" : "AUTO",
+    "REBALANCE_MODE" : "SEMI_AUTO",
     "NUM_PARTITIONS" : "6",
     "REPLICAS" : "3",
     "STATE_MODEL_DEF_REF" : "MasterSlave",
@@ -608,7 +608,7 @@ ExternalView for myDB:
 
 As we've seen in this Quickstart, Helix takes care of partitioning, load balancing, elasticity, failure detection and recovery.
 
-##### ZOOINSPECTOR
+##### ZooInspector
 
 You can view all of the underlying data by going directly to ZooKeeper.  Use the ZooInspector tool that comes with ZooKeeper to browse the data. It is a Java applet (make sure you have X Windows).
 

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c5742650/src/site/markdown/Tutorial.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/Tutorial.md b/src/site/markdown/Tutorial.md
index 27f9fd9..8e025b2 100644
--- a/src/site/markdown/Tutorial.md
+++ b/src/site/markdown/Tutorial.md
@@ -36,12 +36,14 @@ Convention: we first cover the _basic_ approach, which is the easiest to impleme
 2. [Spectator](./tutorial_spectator.html)
 3. [Controller](./tutorial_controller.html)
 4. [Rebalancing Algorithms](./tutorial_rebalance.html)
-5. [State Machines](./tutorial_state.html)
-6. [Messaging](./tutorial_messaging.html)
-7. [Customized health check](./tutorial_health.html)
-8. [Throttling](./tutorial_throttling.html)
-9. [Application Property Store](./tutorial_propstore.html)
-10. [Admin Interface](./tutorial_admin.html)
+5. [User-Defined Rebalancing](./tutorial_user_def_rebalancer.html)
+6. [State Machines](./tutorial_state.html)
+7. [Messaging](./tutorial_messaging.html)
+8. [Customized health check](./tutorial_health.html)
+9. [Throttling](./tutorial_throttling.html)
+10. [Application Property Store](./tutorial_propstore.html)
+11. [Admin Interface](./tutorial_admin.html)
+12. [YAML Cluster Setup](./tutorial_yaml.html)
 
 ### Preliminaries
 
@@ -180,9 +182,9 @@ Helix does this by assigning a STATE to a partition (such as MASTER, SLAVE), and
 
 There are 3 assignment modes Helix can operate on
 
-* AUTO_REBALANCE: Helix decides the placement and state of a partition.
-* AUTO: Application decides the placement but Helix decides the state of a partition.
-* CUSTOM: Application controls the placement and state of a partition.
+* FULL_AUTO: Helix decides the placement and state of a partition.
+* SEMI_AUTO: Application decides the placement but Helix decides the state of a partition.
+* CUSTOMIZED: Application controls the placement and state of a partition.
 
 For more info on the assignment modes, see [Rebalancing Algorithms](./tutorial_rebalance.html) of the tutorial.
 
@@ -190,7 +192,7 @@ For more info on the assignment modes, see [Rebalancing Algorithms](./tutorial_r
     String RESOURCE_NAME = "MyDB";
     int NUM_PARTITIONS = 6;
     STATE_MODEL_NAME = "MasterSlave";
-    String MODE = "AUTO";
+    String MODE = "SEMI_AUTO";
     int NUM_REPLICAS = 2;
 
     admin.addResource(CLUSTER_NAME, RESOURCE_NAME, NUM_PARTITIONS, STATE_MODEL_NAME, MODE);

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c5742650/src/site/markdown/index.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/index.md b/src/site/markdown/index.md
index 57985d0..163f559 100644
--- a/src/site/markdown/index.md
+++ b/src/site/markdown/index.md
@@ -46,6 +46,8 @@ Navigating the Documentation
 
 [Distributed Task DAG Execution](./recipes/task_dag_execution.html)
 
+[User-Defined Rebalancer Example](./recipes/user_def_rebalancer.html)
+
 
 What Is Helix
 --------------
@@ -54,45 +56,46 @@ Helix is a generic _cluster management_ framework used for the automatic managem
 
 What Is Cluster Management
 --------------------------
-To understand Helix, first you need to understand what is _cluster management_.  A distributed system typically runs on multiple nodes for the following reasons:
+To understand Helix, first you need to understand _cluster management_.  A distributed system typically runs on multiple nodes for the following reasons:
 
 * scalability
 * fault tolerance
 * load balancing
 
-Each node performs one or more of the primary function of the cluster, such as storing/serving data, producing/consuming data streams, etc.  Once configured for your system, Helix acts as the global brain for the system.  It is designed to make decisions that cannot be made in isolation.  Examples of decisions that require global knowledge and coordination:
+Each node performs one or more of the primary functions of the cluster, such as storing and serving data, producing and consuming data streams, and so on.  Once configured for your system, Helix acts as the global brain for the system.  It is designed to make decisions that cannot be made in isolation.  Examples of such decisions that require global knowledge and coordination:
 
 * scheduling of maintenance tasks, such as backups, garbage collection, file consolidation, index rebuilds
 * repartitioning of data or resources across the cluster
 * informing dependent systems of changes so they can react appropriately to cluster changes
 * throttling system tasks and changes
 
-While it is possible to integrate these functions into the distributed system, it complicates the code.  Helix has abstracted common cluster management tasks, enabling the system builder to model the desired behavior in a declarative state model, and let Helix manage the coordination.  The result is less new code to write, and a robust, highly operable system.
+While it is possible to integrate these functions into the distributed system, it complicates the code.  Helix has abstracted common cluster management tasks, enabling the system builder to model the desired behavior with a declarative state model, and let Helix manage the coordination.  The result is less new code to write, and a robust, highly operable system.
 
 
 Key Features of Helix
 ---------------------
-1. Automatic assignment of resource/partition to nodes
+1. Automatic assignment of resources and partitions to nodes
 2. Node failure detection and recovery
-3. Dynamic addition of Resources 
+3. Dynamic addition of resources 
 4. Dynamic addition of nodes to the cluster
 5. Pluggable distributed state machine to manage the state of a resource via state transitions
-6. Automatic load balancing and throttling of transitions 
+6. Automatic load balancing and throttling of transitions
+7. Optional pluggable rebalancing for user-defined assignment of resources and partitions
 
 
 Why Helix
 ---------
-Modeling a distributed system as a state machine with constraints on state and transitions has the following benefits:
+Modeling a distributed system as a state machine with constraints on states and transitions has the following benefits:
 
-* Separates cluster management from the core functionality.
-* Quick transformation from a single node system to an operable, distributed system.
-* Simplicity: System components do not have to manage global cluster.  This division of labor makes it easier to build, debug, and maintain your system.
+* Separates cluster management from the core functionality of the system.
+* Allows a quick transformation from a single node system to an operable, distributed system.
+* Increases simplicity: system components do not have to manage a global cluster.  This division of labor makes it easier to build, debug, and maintain your system.
 
 
 Build Instructions
 ------------------
 
-Requirements: Jdk 1.6+, Maven 2.0.8+
+Requirements: JDK 1.6+, Maven 2.0.8+
 
 ```
     git clone https://git-wip-us.apache.org/repos/asf/incubator-helix.git

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c5742650/src/site/markdown/recipes/lock_manager.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/recipes/lock_manager.md b/src/site/markdown/recipes/lock_manager.md
index 84420dd..252ace7 100644
--- a/src/site/markdown/recipes/lock_manager.md
+++ b/src/site/markdown/recipes/lock_manager.md
@@ -137,7 +137,7 @@ This provides more details on how to setup the cluster and where to plugin appli
 Create a lock group and specify the number of locks in the lock group. 
 
 ```
-./helix-admin --zkSvr localhost:2199  --addResource lock-manager-demo lock-group 6 OnlineOffline AUTO_REBALANCE
+./helix-admin --zkSvr localhost:2199  --addResource lock-manager-demo lock-group 6 OnlineOffline FULL_AUTO
 ```
 
 ##### Start the nodes

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c5742650/src/site/markdown/recipes/rabbitmq_consumer_group.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/recipes/rabbitmq_consumer_group.md b/src/site/markdown/recipes/rabbitmq_consumer_group.md
index ec3053a..9edc2cb 100644
--- a/src/site/markdown/recipes/rabbitmq_consumer_group.md
+++ b/src/site/markdown/recipes/rabbitmq_consumer_group.md
@@ -148,7 +148,7 @@ Cluster setup
 -------------
 This step creates a ZNode on ZooKeeper for the cluster and adds the state model. We use the OnlineOffline state model since there is no need for other states. The consumer is either processing a queue or it is not.
 
-It creates a resource called "rabbitmq-consumer-group" with 6 partitions. The execution mode is set to AUTO_REBALANCE. This means that the Helix controls the assignment of partition to consumers and automatically distributes the partitions evenly among the active consumers. When a consumer is added or removed, it ensures that a minimum number of partitions are shuffled.
+It creates a resource called "rabbitmq-consumer-group" with 6 partitions. The execution mode is set to FULL_AUTO. This means that Helix controls the assignment of partitions to consumers and automatically distributes the partitions evenly among the active consumers. When a consumer is added or removed, it ensures that a minimum number of partitions are shuffled.
 
 ```
       zkclient = new ZkClient(zkAddr, ZkClient.DEFAULT_SESSION_TIMEOUT,
@@ -165,7 +165,7 @@ It creates a resource called "rabbitmq-consumer-group" with 6 partitions. The ex
 
       // add resource "topic" which has 6 partitions
       String resourceName = "rabbitmq-consumer-group";
-      admin.addResource(clusterName, resourceName, 6, "OnlineOffline", "AUTO_REBALANCE");
+      admin.addResource(clusterName, resourceName, 6, "OnlineOffline", "FULL_AUTO");
 ```
 
 Starting the consumers

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c5742650/src/site/markdown/recipes/user_def_rebalancer.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/recipes/user_def_rebalancer.md b/src/site/markdown/recipes/user_def_rebalancer.md
new file mode 100644
index 0000000..8beac0a
--- /dev/null
+++ b/src/site/markdown/recipes/user_def_rebalancer.md
@@ -0,0 +1,287 @@
+<!---
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+Lock Manager with a User-Defined Rebalancer
+-------------------------------------------
+Helix is able to compute node preferences and state assignments automatically using general-purpose algorithms. In many cases, a distributed system implementer may choose to instead define a customized approach to computing the location of replicas, the state mapping, or both in response to the addition or removal of participants. The following is an implementation of the [Distributed Lock Manager](./lock_manager.html) that includes a user-defined rebalancer.
+
+### Define the cluster and locks
+
+The YAML file below fully defines the cluster and the locks. A lock can be in one of two states: locked and unlocked. Transitions can happen in either direction, and the locked state is preferred. A resource in this example is the entire collection of locks to distribute. A partition is mapped to a lock; in this case that means there are 12 locks. These 12 locks will be distributed across 3 nodes. The constraints indicate that only one replica of a lock can be in the locked state at any given time. These locks can each only have a single holder, defined by a replica count of 1.
+
+Notice the rebalancer section of the definition. The mode is set to USER_DEFINED and the class name refers to the plugged-in rebalancer implementation. This implementation is called whenever the state of the cluster changes, as is the case when participants are added or removed from the system.
+
+Location: incubator-helix/recipes/user-rebalanced-lock-manager/src/main/resources/lock-manager-config.yaml
+
+```
+clusterName: lock-manager-custom-rebalancer # unique name for the cluster
+resources:
+  - name: lock-group # unique resource name
+    rebalancer: # we will provide our own rebalancer
+      mode: USER_DEFINED
+      class: org.apache.helix.userdefinedrebalancer.LockManagerRebalancer
+    partitions:
+      count: 12 # number of locks
+      replicas: 1 # number of simultaneous holders for each lock
+    stateModel:
+      name: lock-unlock # unique model name
+      states: [LOCKED, RELEASED, DROPPED] # the list of possible states
+      transitions: # the list of possible transitions
+        - name: Unlock
+          from: LOCKED
+          to: RELEASED
+        - name: Lock
+          from: RELEASED
+          to: LOCKED
+        - name: DropLock
+          from: LOCKED
+          to: DROPPED
+        - name: DropUnlock
+          from: RELEASED
+          to: DROPPED
+        - name: Undrop
+          from: DROPPED
+          to: RELEASED
+      initialState: RELEASED
+    constraints:
+      state:
+        counts: # maximum number of replicas of a partition that can be in each state
+          - name: LOCKED
+            count: "1"
+          - name: RELEASED
+            count: "-1"
+          - name: DROPPED
+            count: "-1"
+        priorityList: [LOCKED, RELEASED, DROPPED] # states in order of priority
+      transition: # transitions priority to enforce order that transitions occur
+        priorityList: [Unlock, Lock, Undrop, DropUnlock, DropLock]
+participants: # list of nodes that can acquire locks
+  - name: localhost_12001
+    host: localhost
+    port: 12001
+  - name: localhost_12002
+    host: localhost
+    port: 12002
+  - name: localhost_12003
+    host: localhost
+    port: 12003
+```
+
+Then, Helix's YAMLClusterSetup tool can read in the configuration and bootstrap the cluster immediately:
+
+```
+YAMLClusterSetup setup = new YAMLClusterSetup(zkAddress);
+InputStream input =
+    Thread.currentThread().getContextClassLoader()
+        .getResourceAsStream("lock-manager-config.yaml");
+YAMLClusterSetup.YAMLClusterConfig config = setup.setupCluster(input);
+```
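+
+The returned YAMLClusterConfig mirrors what was just set up; for example, its cluster name is reused below when starting the controller:
+
+```
+// the parsed config carries the cluster name from the YAML above
+String clusterName = config.clusterName; // "lock-manager-custom-rebalancer"
+```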
+
+### Write a rebalancer
+Below is a full implementation of a rebalancer. In this case, it simply throws out the previous ideal state, computes the target node for as many partition replicas as can hold a lock in the LOCKED state (in this example, one), and assigns them the LOCKED state (which is at the head of the state preference list). Clearly a more robust implementation would likely examine the current ideal state to maintain current assignments, and the full state list to handle models more complicated than this one. However, for a simple lock holder implementation, this is sufficient.
+
+Location: incubator-helix/recipes/user-rebalanced-lock-manager/src/main/java/org/apache/helix/userdefinedrebalancer/LockManagerRebalancer.java
+
+```
+public class LockManagerRebalancer implements Rebalancer {
+  // logger declaration assumed (not shown in the original listing); LOG is used below
+  private static final Logger LOG = Logger.getLogger(LockManagerRebalancer.class);
+  @Override
+  public void init(HelixManager manager) {
+    // do nothing; this rebalancer is independent of the manager
+  }
+
+  @Override
+  public ResourceAssignment computeResourceMapping(Resource resource, IdealState currentIdealState,
+      CurrentStateOutput currentStateOutput, ClusterDataCache clusterData) {
+    // Initialize an empty mapping of locks to participants
+    ResourceAssignment assignment = new ResourceAssignment(resource.getResourceName());
+
+    // Get the list of live participants in the cluster
+    List<String> liveParticipants = new ArrayList<String>(clusterData.getLiveInstances().keySet());
+
+    // Get the state model (should be a simple lock/unlock model) and the highest-priority state
+    String stateModelName = currentIdealState.getStateModelDefRef();
+    StateModelDefinition stateModelDef = clusterData.getStateModelDef(stateModelName);
+    if (stateModelDef.getStatesPriorityList().size() < 1) {
+      LOG.error("Invalid state model definition. There should be at least one state.");
+      return assignment;
+    }
+    String lockState = stateModelDef.getStatesPriorityList().get(0);
+
+    // Count the number of participants allowed to lock each lock
+    String stateCount = stateModelDef.getNumInstancesPerState(lockState);
+    int lockHolders = 0;
+    try {
+      // a numeric value is a custom-specified number of participants allowed to lock the lock
+      lockHolders = Integer.parseInt(stateCount);
+    } catch (NumberFormatException e) {
+      LOG.error("Invalid state model definition. The lock state does not have a valid count");
+      return assignment;
+    }
+
+    // Fairly assign the lock state to the participants using a simple mod-based sequential
+    // assignment. For instance, if each lock can be held by 3 participants, lock 0 would be held
+    // by participants (0, 1, 2), lock 1 would be held by (1, 2, 3), and so on, wrapping around the
+    // number of participants as necessary.
+    // This assumes a simple lock-unlock model where the only state of interest is which nodes have
+    // acquired each lock.
+    int i = 0;
+    for (Partition partition : resource.getPartitions()) {
+      Map<String, String> replicaMap = new HashMap<String, String>();
+      for (int j = i; j < i + lockHolders; j++) {
+        int participantIndex = j % liveParticipants.size();
+        String participant = liveParticipants.get(participantIndex);
+        // enforce that a participant can only have one instance of a given lock
+        if (!replicaMap.containsKey(participant)) {
+          replicaMap.put(participant, lockState);
+        }
+      }
+      assignment.addReplicaMap(partition, replicaMap);
+      i++;
+    }
+    return assignment;
+  }
+}
+```
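+
+To see the effect of the mod-based loop in isolation, here is a small standalone sketch for 12 locks, 3 participants, and 1 holder per lock. (The participant chosen for each lock depends on the ordering of the live-instance list, so the exact mapping may differ from the demo output shown later.)
+
+```
+// standalone illustration of the assignment loop above; assumes java.util.Arrays and java.util.List
+List<String> participants = Arrays.asList("localhost_12001", "localhost_12002", "localhost_12003");
+int lockHolders = 1;
+int i = 0;
+for (int lock = 0; lock < 12; lock++) {
+  for (int j = i; j < i + lockHolders; j++) {
+    System.out.println("lock-group_" + lock + " -> " + participants.get(j % participants.size()));
+  }
+  i++;
+}
+```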
+
+### Start up the participants
+Here is a lock class based on the newly defined lock-unlock state model so that the participant can receive callbacks on state transitions.
+
+Location: incubator-helix/recipes/user-rebalanced-lock-manager/src/main/java/org/apache/helix/userdefinedrebalancer/Lock.java
+
+```
+public class Lock extends StateModel {
+  private String lockName;
+
+  public Lock(String lockName) {
+    this.lockName = lockName;
+  }
+
+  @Transition(from = "RELEASED", to = "LOCKED")
+  public void lock(Message m, NotificationContext context) {
+    System.out.println(context.getManager().getInstanceName() + " acquired lock:" + lockName);
+  }
+
+  @Transition(from = "LOCKED", to = "RELEASED")
+  public void release(Message m, NotificationContext context) {
+    System.out.println(context.getManager().getInstanceName() + " releasing lock:" + lockName);
+  }
+}
+```
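+
+The state model above also defines drop transitions (RELEASED to DROPPED, LOCKED to DROPPED, and DROPPED to RELEASED). If the application needs to act on them, handlers follow the same pattern; a hypothetical example:
+
+```
+@Transition(from = "RELEASED", to = "DROPPED")
+public void drop(Message m, NotificationContext context) {
+  // hypothetical handler: clean up any state associated with the dropped lock
+  System.out.println(context.getManager().getInstanceName() + " dropped lock:" + lockName);
+}
+```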
+
+Here is the factory to make the Lock class accessible.
+
+Location: incubator-helix/recipes/user-rebalanced-lock-manager/src/main/java/org/apache/helix/userdefinedrebalancer/LockFactory.java
+
+```
+public class LockFactory extends StateModelFactory<Lock> {
+  @Override
+  public Lock createNewStateModel(String lockName) {
+    return new Lock(lockName);
+  }
+}
+```
+
+Finally, here is the factory registration and the start of the participant:
+
+```
+participantManager =
+    HelixManagerFactory.getZKHelixManager(clusterName, participantName, InstanceType.PARTICIPANT,
+        zkAddress);
+participantManager.getStateMachineEngine().registerStateModelFactory(stateModelName,
+    new LockFactory());
+participantManager.connect();
+```
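+
+When a participant shuts down cleanly, it should also disconnect so the controller can promptly reassign its locks. One possible (illustrative) approach is a JVM shutdown hook:
+
+```
+Runtime.getRuntime().addShutdownHook(new Thread() {
+  @Override
+  public void run() {
+    participantManager.disconnect();
+  }
+});
+```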
+
+### Start up the controller
+
+```
+controllerManager =
+    HelixControllerMain.startHelixController(zkAddress, config.clusterName, "controller",
+        HelixControllerMain.STANDALONE);
+```
+
+### Try it out
+#### Building 
+```
+git clone https://git-wip-us.apache.org/repos/asf/incubator-helix.git
+cd incubator-helix
+mvn clean install package -DskipTests
+cd recipes/user-rebalanced-lock-manager/target/user-rebalanced-lock-manager-pkg/bin
+chmod +x *
+./lock-manager-demo.sh
+```
+
+#### Output
+
+```
+./lock-manager-demo 
+STARTING localhost_12002
+STARTING localhost_12001
+STARTING localhost_12003
+STARTED localhost_12001
+STARTED localhost_12003
+STARTED localhost_12002
+localhost_12003 acquired lock:lock-group_4
+localhost_12002 acquired lock:lock-group_8
+localhost_12001 acquired lock:lock-group_10
+localhost_12001 acquired lock:lock-group_3
+localhost_12001 acquired lock:lock-group_6
+localhost_12003 acquired lock:lock-group_0
+localhost_12002 acquired lock:lock-group_5
+localhost_12001 acquired lock:lock-group_9
+localhost_12002 acquired lock:lock-group_2
+localhost_12003 acquired lock:lock-group_7
+localhost_12003 acquired lock:lock-group_11
+localhost_12002 acquired lock:lock-group_1
+lockName  acquired By
+======================================
+lock-group_0  localhost_12003
+lock-group_1  localhost_12002
+lock-group_10 localhost_12001
+lock-group_11 localhost_12003
+lock-group_2  localhost_12002
+lock-group_3  localhost_12001
+lock-group_4  localhost_12003
+lock-group_5  localhost_12002
+lock-group_6  localhost_12001
+lock-group_7  localhost_12003
+lock-group_8  localhost_12002
+lock-group_9  localhost_12001
+Stopping the first participant
+localhost_12001 Interrupted
+localhost_12002 acquired lock:lock-group_3
+localhost_12003 acquired lock:lock-group_6
+localhost_12003 acquired lock:lock-group_10
+localhost_12002 acquired lock:lock-group_9
+lockName  acquired By
+======================================
+lock-group_0  localhost_12003
+lock-group_1  localhost_12002
+lock-group_10 localhost_12003
+lock-group_11 localhost_12003
+lock-group_2  localhost_12002
+lock-group_3  localhost_12002
+lock-group_4  localhost_12003
+lock-group_5  localhost_12002
+lock-group_6  localhost_12003
+lock-group_7  localhost_12003
+lock-group_8  localhost_12002
+lock-group_9  localhost_12002
+```
+
+Notice that the lock assignment directly follows the assignment generated by the user-defined rebalancer both initially and after a participant is removed from the system.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c5742650/src/site/markdown/tutorial_controller.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/tutorial_controller.md b/src/site/markdown/tutorial_controller.md
index 17cd532..e391673 100644
--- a/src/site/markdown/tutorial_controller.md
+++ b/src/site/markdown/tutorial_controller.md
@@ -83,7 +83,7 @@ If setting up a separate controller process is not viable, then it is possible t
 
 #### CONTROLLER AS A SERVICE
 
-One of the cool feature we added in Helix was to use a set of controllers to manage a large number of clusters. 
+One of the cool features we added in Helix is to use a set of controllers to manage a large number of clusters. 
 
 For example if you have X clusters to be managed, instead of deploying X*3 (3 controllers for fault tolerance) controllers for each cluster, one can deploy just 3 controllers.  Each controller can manage X/3 clusters.  If any controller fails, the remaining two will manage X/2 clusters.
 

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c5742650/src/site/markdown/tutorial_messaging.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/tutorial_messaging.md b/src/site/markdown/tutorial_messaging.md
index f3fef10..c6fd3b2 100644
--- a/src/site/markdown/tutorial_messaging.md
+++ b/src/site/markdown/tutorial_messaging.md
@@ -25,10 +25,10 @@ In this chapter, we\'ll learn about messaging, a convenient feature in Helix for
 
 Consider a search system where an index replica starts up and does not yet have an index. A typical solution is to get the index from a common location, or to copy the index from another replica.
 
-Helix provides a messaging api for intra-cluster communication between nodes in the system.  Helix provides a mechanism to specify the message recipient in terms of resource, partition, and state rather than specifying hostnames.  Helix ensures that the message is delivered to all of the required recipients. In this particular use case, the instance can specify the recipient criteria as all replicas of the desired partition to bootstrap.
+Helix provides a messaging API for intra-cluster communication between nodes in the system.  Helix provides a mechanism to specify the message recipient in terms of resource, partition, and state rather than specifying hostnames.  Helix ensures that the message is delivered to all of the required recipients. In this particular use case, the instance can specify the recipient criteria as all replicas of the desired partition to bootstrap.
 Since Helix is aware of the global state of the system, it can send the message to appropriate nodes. Once the nodes respond, Helix provides the bootstrapping replica with all the responses.
 
-This is a very generic api and can also be used to schedule various periodic tasks in the cluster, such as data backups, log cleanup, etc.
+This is a very generic API and can also be used to schedule various periodic tasks in the cluster, such as data backups, log cleanup, etc.
 System administrators can also perform ad-hoc tasks, such as on-demand backups or a system command (such as rm -rf ;) across all nodes of the cluster.
 
 ```

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c5742650/src/site/markdown/tutorial_participant.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/tutorial_participant.md b/src/site/markdown/tutorial_participant.md
index 19e6f98..cd4bcd2 100644
--- a/src/site/markdown/tutorial_participant.md
+++ b/src/site/markdown/tutorial_participant.md
@@ -19,7 +19,7 @@ under the License.
 
 # Helix Tutorial: Participant
 
-In this chapter, we'll learn how to implement a PARTICIPANT, which is a primary functional component of a distributed system.
+In this chapter, we'll learn how to implement a Participant, which is a primary functional component of a distributed system.
 
 
 ### Start the Helix agent
@@ -43,6 +43,7 @@ The methods of the State Model will be called when controller sends transitions
 * MasterSlaveStateModelFactory
 * LeaderStandbyStateModelFactory
 * BootstrapHandler
+* _An application-defined state model factory_
 
 
 ```
@@ -58,7 +59,7 @@ The methods of the State Model will be called when controller sends transitions
      manager.connect();
 ```
 
-Helix doesn't know what it means to change from OFFLIN-->ONLINE or ONLINE-->OFFLINE.  The following code snippet shows where you insert your system logic for these two state transitions.
+Helix doesn't know what it means to change from OFFLINE-->ONLINE or ONLINE-->OFFLINE.  The following code snippet shows where you insert your system logic for these two state transitions.
 
 ```
 public class OnlineOfflineStateModelFactory extends

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c5742650/src/site/markdown/tutorial_rebalance.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/tutorial_rebalance.md b/src/site/markdown/tutorial_rebalance.md
index 1f5930d..f8f0511 100644
--- a/src/site/markdown/tutorial_rebalance.md
+++ b/src/site/markdown/tutorial_rebalance.md
@@ -19,7 +19,7 @@ under the License.
 
 # Helix Tutorial: Rebalancing Algorithms
 
-The placement of partitions in a distributed system is essential for the reliability and scalability of the system.  For example, when a node fails, it is important that the partitions hosted on that node are reallocated evenly among the remaining nodes. Consistent hashing is one such algorithm that can satisfy this guarantee.  Helix provides a variant of consistent hashing based on the RUSH algorithm.
+The placement of partitions in a distributed system is essential for the reliability and scalability of the system.  For example, when a node fails, it is important that the partitions hosted on that node are reallocated evenly among the remaining nodes. Consistent hashing is one such algorithm that can satisfy this guarantee.  Helix provides a variant of consistent hashing based on the RUSH algorithm, among others.
 
 This means that, given a number of partitions, replicas, and nodes, Helix does the automatic assignment of partitions to nodes such that:
 
@@ -32,25 +32,26 @@ Helix employs a rebalancing algorithm to compute the _ideal state_ of the system
 
 Helix makes it easy to perform this operation, while giving you control over the algorithm.  In this section, we\'ll see how to implement the desired behavior.
 
-Helix has three options for rebalancing, in increasing order of customization by the system builder:
+Helix has four options for rebalancing, in increasing order of customization by the system builder:
 
-* AUTO_REBALANCE
-* AUTO
-* CUSTOM
+* FULL_AUTO
+* SEMI_AUTO
+* CUSTOMIZED
+* USER_DEFINED
 
 ```
-            |AUTO REBALANCE|   AUTO     |   CUSTOM  |       
-            -----------------------------------------
-   LOCATION | HELIX        |  APP       |  APP      |
-            -----------------------------------------
-      STATE | HELIX        |  HELIX     |  APP      |
-            -----------------------------------------
+            |FULL_AUTO     |  SEMI_AUTO | CUSTOMIZED|  USER_DEFINED  |
+            ---------------------------------------------------------|
+   LOCATION | HELIX        |  APP       |  APP      |      APP       |
+            ---------------------------------------------------------|
+      STATE | HELIX        |  HELIX     |  APP      |      APP       |
+            ----------------------------------------------------------
 ```
 
 
-### AUTO_REBALANCE
+### FULL_AUTO
 
-When the idealstate mode is set to AUTO_REBALANCE, Helix controls both the location of the replica along with the state. This option is useful for applications where creation of a replica is not expensive. 
+When the rebalance mode is set to FULL_AUTO, Helix controls both the location and the state of each replica. This option is useful for applications where creation of a replica is not expensive.
 
 For example, consider this system that uses a MasterSlave state model, with 3 partitions and 2 replicas in the ideal state.
 
@@ -58,7 +59,7 @@ For example, consider this system that uses a MasterSlave state model, with 3 pa
 {
   "id" : "MyResource",
   "simpleFields" : {
-    "IDEAL_STATE_MODE" : "AUTO_REBALANCE",
+    "REBALANCE_MODE" : "FULL_AUTO",
     "NUM_PARTITIONS" : "3",
     "REPLICAS" : "2",
     "STATE_MODEL_DEF_REF" : "MasterSlave",
@@ -103,9 +104,9 @@ If there are 3 nodes in the cluster, then Helix will balance the masters and sla
 Another typical example is evenly distributing a group of tasks among the currently healthy processes. For example, if there are 60 tasks and 4 nodes, Helix assigns 15 tasks to each node. 
 When one node fails, Helix redistributes its 15 tasks to the remaining 3 nodes, resulting in a balanced 20 tasks per node. Similarly, if a node is added, Helix re-allocates 3 tasks from each of the 4 nodes to the 5th node, resulting in a balanced distribution of 12 tasks per node.
 
-#### AUTO
+#### SEMI_AUTO
 
-When the application needs to control the placement of the replicas, use the AUTO idealstate mode.
+When the application needs to control the placement of the replicas, use the SEMI_AUTO rebalance mode.
 
 Example: In the ideal state below, the partition \'MyResource_0\' is constrained to be placed only on node1 or node2.  The choice of _state_ is still controlled by Helix.  That means MyResource_0.MASTER could be on node1 and MyResource_0.SLAVE on node2, or vice-versa but neither would be placed on node3.
 
@@ -113,7 +114,7 @@ Example: In the ideal state below, the partition \'MyResource_0\' is constrained
 {
   "id" : "MyResource",
   "simpleFields" : {
-    "IDEAL_STATE_MODE" : "AUTO",
+    "REBALANCE_MODE" : "SEMI_AUTO",
     "NUM_PARTITIONS" : "3",
     "REPLICAS" : "2",
     "STATE_MODEL_DEF_REF" : "MasterSlave",
@@ -130,11 +131,11 @@ Example: In the ideal state below, the partition \'MyResource_0\' is constrained
 
 The MasterSlave state model requires that a partition has exactly one MASTER at all times, and the other replicas should be SLAVEs.  In this simple example with 2 replicas per partition, there would be one MASTER and one SLAVE.  Upon failover, a SLAVE has to assume mastership, and a new SLAVE will be generated.
 
-In this mode when node1 fails, unlike in AUTO-REBALANCE mode the partition is _not_ moved from node1 to node3. Instead, Helix will decide to change the state of MyResource_0 on node2 from SLAVE to MASTER, based on the system constraints. 
+In this mode, when node1 fails, unlike in FULL_AUTO mode, the partition is _not_ moved from node1 to node3. Instead, Helix will decide to change the state of MyResource_0 on node2 from SLAVE to MASTER, based on the system constraints.
 
-#### CUSTOM
+#### CUSTOMIZED
 
-Finally, Helix offers a third mode called CUSTOM, in which the application controls the placement _and_ state of each replica. The application needs to implement a callback interface that Helix invokes when the cluster state changes. 
+Helix offers a third mode called CUSTOMIZED, in which the application controls the placement _and_ state of each replica. The application needs to implement a callback interface that Helix invokes when the cluster state changes. 
 Within this callback, the application can recompute the idealstate. Helix will then issue appropriate transitions such that _Idealstate_ and _Currentstate_ converge.
 
 Here\'s an example, again with 3 partitions, 2 replicas per partition, and the MasterSlave state model:
@@ -143,7 +144,7 @@ Here\'s an example, again with 3 partitions, 2 replicas per partition, and the M
 {
   "id" : "MyResource",
   "simpleFields" : {
-      "IDEAL_STATE_MODE" : "CUSTOM",
+    "REBALANCE_MODE" : "CUSTOMIZED",
     "NUM_PARTITIONS" : "3",
     "REPLICAS" : "2",
     "STATE_MODEL_DEF_REF" : "MasterSlave",
@@ -166,3 +167,11 @@ Here\'s an example, again with 3 partitions, 2 replicas per partition, and the M
 ```
 
 Suppose the current state of the system is 'MyResource_0' -> {N1:MASTER, N2:SLAVE} and the application changes the ideal state to 'MyResource_0' -> {N1:SLAVE,N2:MASTER}. While the application decides which node is MASTER and which is SLAVE, Helix will not blindly issue MASTER-->SLAVE to N1 and SLAVE-->MASTER to N2 in parallel, since that might result in a transient state where both N1 and N2 are masters, which violates the MasterSlave constraint that there is exactly one MASTER at a time.  Helix will first issue MASTER-->SLAVE to N1 and after it is completed, it will issue SLAVE-->MASTER to N2. 
+
+#### USER_DEFINED
+
+For maximum flexibility, Helix exposes an interface that allows applications to plug in custom rebalancing logic. By providing the name of a class that implements the Rebalancer interface, Helix will automatically invoke that class's rebalancing method whenever there is a change to the live participants in the cluster. For more, see [User-Defined Rebalancer](./tutorial_user_def_rebalancer.html).
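+
+As a quick illustration (a minimal sketch; a HelixAdmin handle and the cluster/resource names are assumed to be in scope, and the class name is a placeholder for your own implementation), an existing resource can be switched to a user-defined rebalancer with the standard HelixAdmin calls:
+
+```
+IdealState idealState = helixAdmin.getResourceIdealState(clusterName, resourceName);
+idealState.setRebalanceMode(RebalanceMode.USER_DEFINED);
+// placeholder for a class that implements the Rebalancer interface
+idealState.setRebalancerClassName("com.example.MyRebalancer");
+helixAdmin.setResourceIdealState(clusterName, resourceName, idealState);
+```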
+
+#### Backwards Compatibility
+
+In previous versions, FULL_AUTO was called AUTO_REBALANCE and SEMI_AUTO was called AUTO. Furthermore, they were presented as the IDEAL_STATE_MODE. Helix supports both IDEAL_STATE_MODE and REBALANCE_MODE, but IDEAL_STATE_MODE is now deprecated and may be phased out in future versions.

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c5742650/src/site/markdown/tutorial_spectator.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/tutorial_spectator.md b/src/site/markdown/tutorial_spectator.md
index a5b9a0e..bdf50a7 100644
--- a/src/site/markdown/tutorial_spectator.md
+++ b/src/site/markdown/tutorial_spectator.md
@@ -19,11 +19,11 @@ under the License.
 
 # Helix Tutorial: Spectator
 
-Next, we\'ll learn how to implement a SPECTATOR.  Typically, a spectator needs to react to changes within the distributed system.  Examples: a client that needs to know where to send a request, a topic consumer in a consumer group.  The spectator is automatically informed of changes in the _external state_ of the cluster, but it does not have to add any code to keep track of other components in the system.
+Next, we\'ll learn how to implement a Spectator.  Typically, a spectator needs to react to changes within the distributed system.  Examples: a client that needs to know where to send a request, a topic consumer in a consumer group.  The spectator is automatically informed of changes in the _external state_ of the cluster, but it does not have to add any code to keep track of other components in the system.
 
 ### Start the Helix agent
 
-Same as for a PARTICIPANT, The Helix agent is the common component that connects each system component with the controller.
+Same as for a Participant, the Helix agent is the common component that connects each system component with the controller.
 
 It requires the following parameters:
 

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c5742650/src/site/markdown/tutorial_user_def_rebalancer.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/tutorial_user_def_rebalancer.md b/src/site/markdown/tutorial_user_def_rebalancer.md
new file mode 100644
index 0000000..6d07878
--- /dev/null
+++ b/src/site/markdown/tutorial_user_def_rebalancer.md
@@ -0,0 +1,196 @@
+<!---
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+# Tutorial: User-Defined Rebalancing
+
+Even though Helix can compute both the location and the state of replicas internally using a default fully-automatic rebalancer, specific applications may require rebalancing strategies that optimize for different requirements. Helix therefore allows applications to plug in arbitrary rebalancer algorithms that implement a provided interface. One of the main design goals of Helix is to provide maximum flexibility to any distributed application, so it allows applications to fully implement the rebalancer, which is the core constraint solver in the system, if the application developer so chooses.
+
+Whenever the state of the cluster changes, as is the case when participants join or leave the cluster, Helix automatically calls the rebalancer to compute a new mapping of all the replicas in the resource. When using a pluggable rebalancer, the only required step is to register it with Helix. Subsequently, no additional bootstrapping steps are necessary. Helix uses reflection to look up and load the class dynamically at runtime. As a result, it is also technically possible to change the rebalancing strategy used at any time.
+
+The Rebalancer interface is as follows:
+
+```
+ResourceAssignment computeResourceMapping(final Resource resource,
+      final IdealState currentIdealState, final CurrentStateOutput currentStateOutput,
+      final ClusterDataCache clusterData);
+```
+The first parameter is the resource to rebalance, the second is the pre-existing ideal mapping, the third is a snapshot of the actual placements and state assignments, and the fourth is a full cache of all of the cluster data available to Helix. Internally, Helix implements the same interface for its own rebalancing routines, so a user-defined rebalancer will be cognizant of the same information about the cluster as an internal implementation. Helix strives to provide applications the ability to implement algorithms that may require a large portion of the entire state of the cluster to make the best placement and state assignment decisions possible.
+
+A ResourceAssignment is a full representation of the location and the state of each replica of each partition of a given resource. This is a simple representation of the placement that the algorithm believes is the best possible. If the placement meets all defined constraints, this is what will become the actual state of the distributed system.
+
+### Specifying a Rebalancer
+For implementations that set up the cluster through existing code, the following HelixAdmin calls will update the Rebalancer class:
+
+```
+IdealState idealState = helixAdmin.getResourceIdealState(clusterName, resourceName);
+idealState.setRebalanceMode(RebalanceMode.USER_DEFINED);
+idealState.setRebalancerClassName(className);
+helixAdmin.setResourceIdealState(clusterName, resourceName, idealState);
+```
+There are two key fields to set to specify that a pluggable rebalancer should be used. First, the rebalance mode should be set to USER_DEFINED, and second, the rebalancer class name should be set to a class that implements Rebalancer and is available at runtime. The class name is a fully-qualified class name consisting of its package and its name. Without the USER_DEFINED mode, the user-defined rebalancer class will not be used even if it is specified. Furthermore, Helix will not attempt to rebalance the resources through its standard routines if the mode is USER_DEFINED, regardless of whether or not a rebalancer class is registered.
+
+Alternatively, the rebalancer class name can be specified in a YAML file representing the cluster configuration. The requirements are the same, but the representation is more compact. Below are the first few lines of an example YAML file. To see a full YAML specification, see the [YAML tutorial](./tutorial_yaml.html).
+
+```
+clusterName: lock-manager-custom-rebalancer # unique name for the cluster
+resources:
+  - name: lock-group # unique resource name
+    rebalancer: # we will provide our own rebalancer
+      mode: USER_DEFINED
+      class: domain.project.helix.rebalancer.UserDefinedRebalancerClass
+...
+```
+
+### Example
+We demonstrate plugging in a simple user-defined rebalancer as part of a revisit of the [distributed lock manager](./recipes/user_def_rebalancer.html) example. It includes a functional Rebalancer implementation, as well as the entire YAML file used to define the cluster.
+
+Consider the case where partitions are locks in a lock manager, 6 locks are to be distributed evenly to a set of participants, and only one participant can hold each lock. We can define a rebalancing algorithm that simply takes the modulus of the lock number and the number of participants to evenly distribute the locks across participants. Helix allows capping the number of partitions a participant can accept, but since locks are lightweight, we do not need to define a restriction in this case. The following is a succinct implementation of this algorithm.
+
+```
+@Override
+public ResourceAssignment computeResourceMapping(Resource resource, IdealState currentIdealState,
+    CurrentStateOutput currentStateOutput, ClusterDataCache clusterData) {
+  // Initialize an empty mapping of locks to participants
+  ResourceAssignment assignment = new ResourceAssignment(resource.getResourceName());
+
+  // Get the list of live participants in the cluster
+  List<String> liveParticipants = new ArrayList<String>(clusterData.getLiveInstances().keySet());
+
+  // Get the state model (should be a simple lock/unlock model) and the highest-priority state
+  String stateModelName = currentIdealState.getStateModelDefRef();
+  StateModelDefinition stateModelDef = clusterData.getStateModelDef(stateModelName);
+  if (stateModelDef.getStatesPriorityList().size() < 1) {
+    LOG.error("Invalid state model definition. There should be at least one state.");
+    return assignment;
+  }
+  String lockState = stateModelDef.getStatesPriorityList().get(0);
+
+  // Count the number of participants allowed to lock each lock
+  String stateCount = stateModelDef.getNumInstancesPerState(lockState);
+  int lockHolders = 0;
+  try {
+    // a numeric value is a custom-specified number of participants allowed to lock the lock
+    lockHolders = Integer.parseInt(stateCount);
+  } catch (NumberFormatException e) {
+    LOG.error("Invalid state model definition. The lock state does not have a valid count");
+    return assignment;
+  }
+
+  // Fairly assign the lock state to the participants using a simple mod-based sequential
+  // assignment. For instance, if each lock can be held by 3 participants, lock 0 would be held
+  // by participants (0, 1, 2), lock 1 would be held by (1, 2, 3), and so on, wrapping around the
+  // number of participants as necessary.
+  // This assumes a simple lock-unlock model where the only state of interest is which nodes have
+  // acquired each lock.
+  int i = 0;
+  for (Partition partition : resource.getPartitions()) {
+    Map<String, String> replicaMap = new HashMap<String, String>();
+    for (int j = i; j < i + lockHolders; j++) {
+      int participantIndex = j % liveParticipants.size();
+      String participant = liveParticipants.get(participantIndex);
+      // enforce that a participant can only have one instance of a given lock
+      if (!replicaMap.containsKey(participant)) {
+        replicaMap.put(participant, lockState);
+      }
+    }
+    assignment.addReplicaMap(partition, replicaMap);
+    i++;
+  }
+  return assignment;
+}
+```
+
+Here is the ResourceAssignment emitted by the user-defined rebalancer for a 3-participant system whenever there is a change to the set of participants.
+
+* Participant_A joins
+
+```
+{
+  "lock_0": { "Participant_A": "LOCKED"},
+  "lock_1": { "Participant_A": "LOCKED"},
+  "lock_2": { "Participant_A": "LOCKED"},
+  "lock_3": { "Participant_A": "LOCKED"},
+  "lock_4": { "Participant_A": "LOCKED"},
+  "lock_5": { "Participant_A": "LOCKED"},
+}
+```
+
+A ResourceAssignment maps, for each partition of the resource, each participant serving a replica to the state of that replica. The state model is a simple LOCKED/RELEASED model, so participant A holds all lock partitions in the LOCKED state.
+
+* Participant_B joins
+
+```
+{
+  "lock_0": { "Participant_A": "LOCKED"},
+  "lock_1": { "Participant_B": "LOCKED"},
+  "lock_2": { "Participant_A": "LOCKED"},
+  "lock_3": { "Participant_B": "LOCKED"},
+  "lock_4": { "Participant_A": "LOCKED"},
+  "lock_5": { "Participant_B": "LOCKED"},
+}
+```
+
+Now that there are two participants, the simple mod-based function assigns every other lock to the second participant. On any system change, the rebalancer is invoked so that the application can define how to redistribute its resources.
+
+* Participant_C joins (steady state)
+
+```
+{
+  "lock_0": { "Participant_A": "LOCKED"},
+  "lock_1": { "Participant_B": "LOCKED"},
+  "lock_2": { "Participant_C": "LOCKED"},
+  "lock_3": { "Participant_A": "LOCKED"},
+  "lock_4": { "Participant_B": "LOCKED"},
+  "lock_5": { "Participant_C": "LOCKED"},
+}
+```
+
+This is the steady state of the system. Notice that four of the six locks now have a different owner. That is because of the naïve modulus-based assignment approach used by the user-defined rebalancer. However, the interface is flexible enough to allow you to employ consistent hashing or any other scheme if minimal movement is a system requirement.
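+
+As an illustration of that point (a hypothetical sketch, not part of the recipe code), a rebalancer that wants to minimize lock movement could replace the modulus step with a rendezvous (highest-random-weight) hash, where each lock is independently assigned to the participant with the best score:
+
+```
+// Hypothetical helper: choose a lock holder by rendezvous hashing, so that adding or
+// removing a participant only moves the locks that participant wins or loses.
+private String chooseHolder(String partitionName, List<String> liveParticipants) {
+  String best = null;
+  int bestScore = Integer.MIN_VALUE;
+  for (String participant : liveParticipants) {
+    int score = (partitionName + "::" + participant).hashCode();
+    if (best == null || score > bestScore) {
+      bestScore = score;
+      best = participant;
+    }
+  }
+  return best;
+}
+```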
+
+* Participant_B fails
+
+```
+{
+  "lock_0": { "Participant_A": "LOCKED"},
+  "lock_1": { "Participant_C": "LOCKED"},
+  "lock_2": { "Participant_A": "LOCKED"},
+  "lock_3": { "Participant_C": "LOCKED"},
+  "lock_4": { "Participant_A": "LOCKED"},
+  "lock_5": { "Participant_C": "LOCKED"},
+}
+```
+
+On any node failure, as in the case of node addition, the rebalancer is invoked automatically so that it can generate a new mapping as a response to the change. Helix ensures that the Rebalancer has the opportunity to reassign locks as required by the application.
+
+* Participant_B (or the replacement for the original Participant_B) rejoins
+
+```
+{
+  "lock_0": { "Participant_A": "LOCKED"},
+  "lock_1": { "Participant_B": "LOCKED"},
+  "lock_2": { "Participant_C": "LOCKED"},
+  "lock_3": { "Participant_A": "LOCKED"},
+  "lock_4": { "Participant_B": "LOCKED"},
+  "lock_5": { "Participant_C": "LOCKED"},
+}
+```
+
+The rebalancer was invoked once again, and the resulting ResourceAssignment reflects the steady state.
+
+### Caveats
+- The rebalancer class must be available at runtime, or else Helix will not attempt to rebalance at all.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c5742650/src/site/markdown/tutorial_yaml.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/tutorial_yaml.md b/src/site/markdown/tutorial_yaml.md
new file mode 100644
index 0000000..1524c9d
--- /dev/null
+++ b/src/site/markdown/tutorial_yaml.md
@@ -0,0 +1,98 @@
+<!---
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Helix Tutorial: YAML Cluster Setup
+
+As an alternative to using Helix Admin to set up the cluster, its resources, constraints, and the state model, Helix supports bootstrapping a cluster configuration based on a YAML file. Below is an annotated example of such a file for a simple distributed lock manager where a lock can only be LOCKED or RELEASED, and each lock only allows a single participant to hold it in the LOCKED state.
+
+```
+clusterName: lock-manager-custom-rebalancer # unique name for the cluster (required)
+resources:
+  - name: lock-group # unique resource name (required)
+    rebalancer: # required
+      mode: USER_DEFINED # required - USER_DEFINED means we will provide our own rebalancer
+      class: org.apache.helix.userdefinedrebalancer.LockManagerRebalancer # required for USER_DEFINED
+    partitions:
+      count: 12 # number of partitions for the resource (default is 1)
+      replicas: 1 # number of replicas per partition (default is 1)
+    stateModel:
+      name: lock-unlock # model name (required)
+      states: [LOCKED, RELEASED, DROPPED] # the list of possible states (required if model not built-in)
+      transitions: # the list of possible transitions (required if model not built-in)
+        - name: Unlock
+          from: LOCKED
+          to: RELEASED
+        - name: Lock
+          from: RELEASED
+          to: LOCKED
+        - name: DropLock
+          from: LOCKED
+          to: DROPPED
+        - name: DropUnlock
+          from: RELEASED
+          to: DROPPED
+        - name: Undrop
+          from: DROPPED
+          to: RELEASED
+      initialState: RELEASED # (required if model not built-in)
+    constraints:
+      state:
+        counts: # maximum number of replicas of a partition that can be in each state (required if model not built-in)
+          - name: LOCKED
+            count: "1"
+          - name: RELEASED
+            count: "-1"
+          - name: DROPPED
+            count: "-1"
+        priorityList: [LOCKED, RELEASED, DROPPED] # states in order of priority (all priorities equal if not specified)
+      transition: # transitions priority to enforce order that transitions occur
+        priorityList: [Unlock, Lock, Undrop, DropUnlock, DropLock] # all priorities equal if not specified
+participants: # list of nodes that can serve replicas (optional if dynamic joining is active, required otherwise)
+  - name: localhost_12001
+    host: localhost
+    port: 12001
+  - name: localhost_12002
+    host: localhost
+    port: 12002
+  - name: localhost_12003
+    host: localhost
+    port: 12003
+```
+
+Using a file like the one above, the cluster can be set up either with the command line:
+
+```
+incubator-helix/helix-core/target/helix-core/pkg/bin/YAMLClusterSetup.sh localhost:2199 lock-manager-config.yaml
+```
+
+or with code:
+
+```
+YAMLClusterSetup setup = new YAMLClusterSetup(zkAddress);
+InputStream input =
+    Thread.currentThread().getContextClassLoader()
+        .getResourceAsStream("lock-manager-config.yaml");
+YAMLClusterSetup.YAMLClusterConfig config = setup.setupCluster(input);
+```
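+
+Note that this only bootstraps the cluster configuration; each participant process still has to connect and register a state model factory for the state model named in the file. A minimal sketch, assuming a hypothetical LockUnlockFactory for the lock-unlock model defined above:
+
+```
+HelixManager manager =
+    HelixManagerFactory.getZKHelixManager("lock-manager-custom-rebalancer", "localhost_12001",
+        InstanceType.PARTICIPANT, "localhost:2199");
+// register the factory under the state model name declared in the YAML file
+manager.getStateMachineEngine().registerStateModelFactory("lock-unlock", new LockUnlockFactory());
+manager.connect();
+```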
+
+Some notes:
+
+- A rebalancer class is only required for the USER_DEFINED mode. It is ignored otherwise.
+
+- Built-in state models, like OnlineOffline, LeaderStandby, and MasterSlave, or state models that have already been added, only require a name for stateModel. If partition and/or replica counts are not provided, a value of 1 is assumed.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c5742650/src/site/site.xml
----------------------------------------------------------------------
diff --git a/src/site/site.xml b/src/site/site.xml
index e44a43e..e9a5cce 100644
--- a/src/site/site.xml
+++ b/src/site/site.xml
@@ -72,6 +72,7 @@
       <item name="Rsync replicated file store" href="./recipes/rsync_replicated_file_store.html"/>
       <item name="Service Discovery" href="./recipes/service_discovery.html"/>
       <item name="Distributed task DAG Execution" href="./recipes/task_dag_execution.html"/>
+      <item name="User-Defined Rebalancer Example" href="./recipes/user_def_rebalancer.html"/>
     </menu>
 
     <menu name="Get Involved">