Posted to user@helix.apache.org by kishore g <g....@gmail.com> on 2013/02/26 03:31:51 UTC

Re: Questions about Helix

Thanks Abhishek. Glad you are enjoying playing with Helix. Apologies for
the insufficient documentation; we have additional documentation that
needs some cleanup, like converting it to markdown format and removing
LinkedIn-specific stuff. It would be great if you could help us here.

The reason we read everything from ZooKeeper is to have a consistent
snapshot of the system state. We have a lot of optimizations to read only
changed data and to use ZK async APIs.

In general it is not a good idea to keep any state in memory in the
controller; it makes it very difficult to reason about issues and also to
provide a fault-tolerant system.

You can add your code in the best possible state calculation stage, and
depend on the data in the cluster data cache. Do not worry about existing
messages or the current state. We have other stages downstream (message
selection) that make sure constraints are not violated.
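
To illustrate, a custom stage plugged into the pipeline might look roughly
like the following. This is a minimal sketch against the 0.6-era controller
API; the "ClusterDataCache" attribute key and the cache accessors are
assumptions you should verify against your Helix version.

    import org.apache.helix.controller.pipeline.AbstractBaseStage;
    import org.apache.helix.controller.stages.ClusterDataCache;
    import org.apache.helix.controller.stages.ClusterEvent;

    // A custom rebalancing stage: it reads only from the snapshot in
    // ClusterDataCache and keeps no state of its own across runs.
    public class CustomIdealStateCalcStage extends AbstractBaseStage {
      @Override
      public void process(ClusterEvent event) throws Exception {
        // "ClusterDataCache" is the attribute key the built-in stages
        // use; verify it against your Helix version.
        ClusterDataCache cache = event.getAttribute("ClusterDataCache");
        if (cache == null) {
          throw new IllegalStateException("cluster data cache not populated");
        }
        // Derive the idealstate purely from cache.getLiveInstances()
        // and cache.getIdealStates(); ignore pending messages and
        // current state -- the downstream message-selection stage
        // enforces transition constraints.
      }
    }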

Basically the trick is to make the idealstate calc code idempotent. That
is, given a state machine, constraints, objectives, and a set of live
nodes, it should always come up with the same idealstate. If you can model
your algo this way, you will be good.
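
As a toy illustration of what idempotent means here (this IdempotentAssigner
is hypothetical, not a Helix class): the assignment is a pure function of the
sorted live-node set and the partition list, so rerunning it with the same
inputs can never produce a different answer.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.TreeMap;
    import java.util.TreeSet;

    // Hypothetical idempotent assigner: the output is a deterministic
    // function of (partitions, liveNodes, replicas) and nothing else,
    // so the controller can recompute it from scratch on every run.
    public final class IdempotentAssigner {
      public static Map<String, List<String>> assign(
          List<String> partitions, TreeSet<String> liveNodes, int replicas) {
        Map<String, List<String>> idealState = new TreeMap<String, List<String>>();
        // TreeSet iteration order is sorted, so the node order is stable.
        List<String> nodes = new ArrayList<String>(liveNodes);
        for (int p = 0; p < partitions.size(); p++) {
          List<String> owners = new ArrayList<String>();
          for (int r = 0; r < Math.min(replicas, nodes.size()); r++) {
            owners.add(nodes.get((p + r) % nodes.size()));
          }
          idealState.put(partitions.get(p), owners);
        }
        return idealState;
      }
    }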

Your understanding about distributed controllers is right. I will provide
more details on the website on running it in distributed mode. But you
probably don't need this since you have only one cluster. You can simply
start multiple controllers and we ensure that only one will be active.
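
Concretely, each node can just start a standalone controller, and Helix's
built-in leader election makes sure only one is active at a time. A sketch
(the HelixControllerMain signature below is from the 0.6-era API, so
double-check it against your version; host and cluster names are
placeholders):

    import org.apache.helix.HelixManager;
    import org.apache.helix.controller.HelixControllerMain;

    public class ControllerLauncher {
      public static void main(String[] args) throws Exception {
        // Start one of these per node; Helix elects a single leader
        // among all connected controllers, no external lock needed.
        HelixManager manager = HelixControllerMain.startHelixController(
            "zk-host:2181",   // ZooKeeper connect string (placeholder)
            "MY_CLUSTER",     // cluster name (placeholder)
            "controller-1",   // a unique controller name per node
            HelixControllerMain.STANDALONE);
        Thread.currentThread().join(); // keep the process alive
      }
    }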

We do have a mechanism to test, simulate failures, and analyze logs.
Unfortunately it uses a LinkedIn-internal tool to validate the logs for
any constraint violation. I will create a JIRA and post the idea and
implementation we have. You can help us take it to the next level.

If you get your algo to be idempotent, then the CustomCodeInvoker might
work for you.
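
For reference, that hook is wired up through HelixCustomCodeRunner, roughly
as below (again a sketch against the 0.6-era API; the resource name and
callback body are placeholders):

    import org.apache.helix.HelixConstants.ChangeType;
    import org.apache.helix.HelixManager;
    import org.apache.helix.NotificationContext;
    import org.apache.helix.participant.CustomCodeCallbackHandler;
    import org.apache.helix.participant.HelixCustomCodeRunner;

    public class RebalanceHook {
      public static void register(HelixManager manager, String zkAddr)
          throws Exception {
        CustomCodeCallbackHandler callback = new CustomCodeCallbackHandler() {
          @Override
          public void onCallback(NotificationContext context) {
            // Recompute the idealstate idempotently from the live
            // instances and write it back (e.g. via HelixAdmin).
          }
        };
        // Helix runs the callback on exactly one instance at a time
        // and fails it over if that instance dies.
        new HelixCustomCodeRunner(manager, zkAddr)
            .invoke(callback)
            .on(ChangeType.LIVE_INSTANCE)
            .usingLeaderStandbyModel("my-custom-rebalancer") // placeholder
            .start();
      }
    }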

Thanks again for the brilliant questions.

 On Feb 25, 2013 12:04 PM, "Abhishek Rai" <ab...@gmail.com> wrote:

> Hi Helix experts,
>
> For the past few weeks I've been playing with Helix and would like to share
> some experiences, and ask some questions.
>
> First of all, thanks to the Helix team for creating and open sourcing such
> an awesome tool!  I like the abstractions used by Helix and found the SOCC
> paper very helpful.
>
> My use case currently is for managing a cluster containing a single DDS.
> In the future, we will need to manage about 5-6 different DDSs within the
> same cluster of machines.  The DDS I'm managing needs customized
> rebalancing.  I've set up a participant on each machine in the cluster, and
> a centralized controller manages the cluster.
>
> I am not sure what the best way is to integrate my rebalancing code with
> Helix controller code.  Kishore previously suggested adding a new stage to
> the controller's pipeline.  An alternative that I've implemented is to
> subclass from GenericHelixController, and in each listener callback, run my
> rebalancing code and write out ideal states using ZKHelixAdmin.  The
> callbacks maintain an in-memory model of cluster state and do not read it
> from Zookeeper as part of the custom rebalancing functionality.  In
> contrast, the pipeline stages used by GenericHelixController seem to read
> the data directly from Zookeeper every time.  The pipeline stages are also
> aware of ongoing transitions, which my rebalancer code is not aware of.
> What is the recommended approach for adding custom rebalancing code?
>
> For high availability, I run 3 controllers on different nodes with custom
> leader-election between them.  When a controller starts, it waits to grab a
> Zookeeper lock, and then connects as a Helix controller.  A controller that
> loses its lock dies and is restarted automatically by the shell.  I tried
> using the "distributed controller" feature in Helix but couldn't get it to
> work.  I kept
> seeing in the controller logs "initial cluster setup is not done...".  I
> tried a few things based on reading the Helix paper and docs (e.g. setting
> up another cluster and adding each controller as a participant to that
> cluster) but couldn't figure out how to make it work.  I realize that I
> don't understand how the distributed controller feature works.  Is the idea
> that each controller is a participant in another Helix cluster, and
> receives controller-ship of a DDS cluster as a "resource assignment"?  In
> that case, is a "super" controller needed for this "super" cluster?  If so,
> then how does one ensure HA of the super cluster?
>
> I've been stress testing the system in production by repeatedly restarting
> controller and participant nodes, all while ensuring that Zookeeper
> stays up.  I have run into some problems.  Kishore helped triage one of
> them last week (https://issues.apache.org/jira/browse/HELIX-53).  This
> problem was manifesting itself as messages of the following form in
> participant and controller logs:
> ERROR org.apache.helix.controller.stages.MessageGenerationPhase: Unable to
> find a next state for partition XYZ  from:SERVING to:OFFLINE
> and also
> ERROR ...  Force CurrentState on Zk to be stateModel's CurrentState.
> I'm still getting some of these messages but I can tell that the system is
> working fine overall now.
>
> What are the semantics of the persistent message queue between the
> controller and the participant?  If the controller restarts or fails over
> while there are outstanding messages for existing participants, does the
> new controller honor the transitions implied by any outstanding messages?
> How does the participant acknowledge that it has executed the transition
> specified in a message?  Does it do so by writing a new current state to
> Zookeeper, or by deleting the old message?
>
> Also, is there any testing framework distributed with Helix for integration
> testing of a customized Helix controller and participants?  For example,
> something that would take care of scaffolding the cluster and provide
> hooks for simulating operational problems such as participant failures.
>
> Thanks for your help!
> Abhishek
>