Posted to user@helix.apache.org by Abhishek Rai <ab...@gmail.com> on 2013/02/02 02:44:16 UTC

Question about participant dropping a subset of assigned partitions

Hi Helix users!

I'm a Helix newbie and need some advice about a use case.  I'm using Helix
to manage a storage system which fits the description of a DDS
("distributed data service" as defined in the Helix SOCC paper).  Each
participant hosts a bunch of partitions of a resource, as assigned by the
controller.  The set of partitions assigned to a participant changes
dynamically as the controller rebalances partitions, nodes join or leave,
etc.

Additionally, I need the ability for a participant to "drop" a subset of
partitions currently assigned to it.  When a partition is dropped by a
participant, Helix would remove the partition from the current state of the
instance, update the external view, and make the partition available for
rebalancing by the controller.  Does the Java API provide a way of
accomplishing this?  If not, are there any workarounds?  Or, was there a
design rationale to disallow such actions from the participant?

Thanks,
Abhishek

Re: Question about participant dropping a subset of assigned partitions

Posted by kishore g <g....@gmail.com>.
Hi Abhishek,

Regarding the standalone agent, Santiago has started this thread:
http://helix-dev.markmail.org/message/5h2fogbigexnhb4s. I think it supports
recursion, where one agent can manage another cluster. This is still in the
design phase, and there are multiple use cases that need a similar solution.
Feel free to contribute to the design/implementation on this JIRA:
https://issues.apache.org/jira/browse/HELIX-45.

If you are setting only the idealstate, I suggest you use the
CustomCodeInvoker. You can register for various changes (e.g., nodes
starting/stopping). Take a look at this test:
https://git-wip-us.apache.org/repos/asf?p=incubator-helix.git;a=blob;f=helix-core/src/test/java/org/apache/helix/integration/TestHelixCustomCodeRunner.java;h=9bf79b8b34c14b7ce1e3fc45a45ceb19fdac4874;hb=437eb42e
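
From that test, the wiring looks roughly like this (a sketch from memory of
the linked code, not a drop-in; the callback class name and the leader
election resource name are placeholders):

import org.apache.helix.HelixConstants.ChangeType;
import org.apache.helix.NotificationContext;
import org.apache.helix.participant.CustomCodeCallbackHandler;
import org.apache.helix.participant.HelixCustomCodeRunner;

public class IdealStateUpdater implements CustomCodeCallbackHandler {
  @Override
  public void onCallback(NotificationContext context) {
    // Helix guarantees this runs actively on only one node at a time;
    // read the cluster state and rewrite the idealstate here.
  }
}

// Run this alongside each Java proxy; Helix leader-elects one active copy.
HelixCustomCodeRunner runner =
    new HelixCustomCodeRunner(manager, zkAddr)
        .invoke(new IdealStateUpdater())
        .on(ChangeType.LIVE_INSTANCE)
        .usingLeaderElection("customCodeRunnerResource");
runner.start();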

This has the advantage that you can still benefit from features like
constraint enforcement and throttling that come with the default controller.
And as I mentioned earlier, this will allow you to upgrade to a newer Helix
version without any issues.

thanks,
Kishore G




On Thu, Feb 7, 2013 at 10:02 AM, Abhishek Rai <ab...@gmail.com> wrote:

> Thanks again for the thoughtful feedback Kishore.
>
> On Sat, Feb 2, 2013 at 12:20 AM, kishore g <g....@gmail.com> wrote:
>
>> Thanks Abhishek, you are thinking in the right direction.
>>
>> Your point about the disable-enable happening too quickly is valid. However,
>> how fast can you detect the C++ process crash and restart? Is the C++
>> process running in daemon mode and restarted automatically, or will the
>> helix-java-proxy agent be responsible for starting the C++ process?
>>
>
> Both options are possible at this time, but I was evaluating the
> disable-partition suggestion for soundness.  However, I agree that it's a
> highly unlikely scenario.
>
>
>> The messaging solution will work but the alternative of modeling each c++
>> process as a participant makes sense and is the right thing to do.
>>
>> Can you provide more details on "it's ok if the java-proxy dies"? Why does
>> it not affect correctness? When the agent is restarted, does it
>> re-register the C++ instances? Do you plan to store the C++ pid in Helix so
>> that the agent can remember the processes it had started?
>>
>
> Death or restarts of the Java proxy may result in temporary
> unavailability of some data until the controller rebalances the lost
> partitions.  Death or restarts of the C++ DDS process also affect
> availability, but if the Java proxy stays up, then (1) the controller may not
> notice the unavailability, and (2) DDS clients may continue to think that
> their data is reachable when it's not.  Sorry, I misused the term
> "correctness" to describe the weaker availability in the latter case.
>
>
>>
>> The reason I am asking these questions is that there is a similar effort on
>> writing a standalone Helix agent that acts as a proxy for other processes
>> started on the node. In general, this approach seems to be quite useful and
>> might have some common functionality that can be leveraged across multiple
>> implementations.
>>
>
> That's great!  The proxy agent will be very useful.  I wonder if a good
> goal would be to enable recursion in the proxy such that the participant
> itself can be a controller for another Helix cluster.  Thus the
> proxy-participant could delegate its set of assigned resources to another
> set of participants.  This may be trivially true.
>
>
>> As for writing the custom rebalancer, you have two options: 1) as you
>> mentioned, you can write it inside the controller; 2) there is another
>> feature called CustomCodeInvoker. You can write your logic to change the
>> idealstate in it and simply run it along with your Java proxy, and Helix
>> will ensure it is actively running on only one node. This adds an overhead
>> of around 50-100 ms in reacting to failures but is much cleaner. If you are
>> doing 1), you need to be careful not to change existing code but simply add
>> a new stage in the pipeline. That way you will be able to upgrade Helix to
>> get new features without breaking your functionality.
>>
>
> Thanks for the suggestions.  I'm doing 1), but in a slightly different way;
> please let me know if I'm totally off in the wrong direction :-)  Or if I'm
> likely to run into problems with future Helix upgrades.
>
> I've subclassed GenericHelixController and implemented all listener
> callbacks.  This subclass registers for all events and ensures that
> GenericHelixController listeners run for each event.  Internally, it
> implements the scheduling logic that it needs and applies it via
> ZKHelixAdmin.setResourceIdealState().  Do you see any clear benefits of
> changing this to insert a new stage in GenericHelixController's pipeline?
>
> I'm following a similar scheme in the custom participant except that it
> directly registers the listeners with Helix without using a
> GenericHelixController.  I will take a closer look at CustomCodeInvoker,
> looks very useful.
>
>
>>
>> On the topic of "Helix does not have a c++ library", do you think it
>> would make things easier if there were a C++ library? This may not be that
>> hard to write, because the only thing that needs to be written in C++ is the
>> participant code, which simply acts on messages from the controller. The
>> majority of the code is in the controller, and it can still run as Java. We
>> are working on a Python agent, and I hope someone will write a C++ agent.
>>
>
> Thanks for the update; yeah, I am thinking of taking a stab at it in 1-2
> months if it still seems useful.
>
>
>>
>> One of the good things about modeling each C++ process as an instance is
>> that in the future, if there is a C++ Helix agent, then you can easily
>> migrate to it.
>>
>
> Cool.
> Thanks again!
> Abhishek
>
>
>>
>> Hope this helps.
>>
>> thanks,
>> Kishore G
>>
>>
>>
>> On Fri, Feb 1, 2013 at 9:44 PM, Terence Yim <ch...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> What do you mean by the "fake" live instance that you mentioned? I think
>>> the Java proxy could simply create one HelixManager participant per C++
>>> instance (hence there would be N HelixManager instances in the Java proxy)
>>> and disconnect each one based on the liveness of its C++ process.
>>>
>>> Terence
>>>
>>> On Fri, Feb 1, 2013 at 7:19 PM, Abhishek Rai <ab...@gmail.com>wrote:
>>>
>>>> Thanks for the quick and thoughtful response Kishore!  Comments inline.
>>>>
>>>>
>>>> On Fri, Feb 1, 2013 at 6:33 PM, kishore g <g....@gmail.com> wrote:
>>>>
>>>>> Hi Abhishek,
>>>>>
>>>>> Thanks for the good question. We have two options (listed later in this
>>>>> email) for allowing a participant to drop partitions.  However, this works
>>>>> in only two (AUTO, CUSTOM) of the three modes (AUTO_REBALANCE, AUTO,
>>>>> CUSTOM) that the controller supports.
>>>>> More info about the modes is here:
>>>>> http://helix.incubator.apache.org/Features.html
>>>>>
>>>>> Can you let me know which mode you are running in?
>>>>>
>>>>
>>>> We are planning to use CUSTOM mode since we have some specific
>>>> requirements about (1) the desired state for each partition, and (2)
>>>> scheduling partitions to instances.  Our requirements for (1) are not
>>>> expressible in the FSM framework.
>>>>
>>>> Also, is it sufficient if the disabled partitions are reassigned
>>>>> uniformly to other nodes, or do you want partitions from other nodes
>>>>> to be assigned to this node as well?
>>>>>
>>>>
>>>> Once a participant disables some partitions, it's alright for the
>>>> default rebalancing logic to kick in.
>>>>
>>>>
>>>>>
>>>>> Also, it will help us if you can tell us the use case in which you need
>>>>> this feature.
>>>>>
>>>>
>>>> Sure, I'm still trying to hash things out but here is a summary.  The
>>>> DDS nodes are C++ processes, which is the crux of the problem.  AFAIK Helix
>>>> does not have a C++ library, so I'm planning to use a participant written
>>>> in Java, which runs as a separate process on the same node, receives state
>>>> transitions from the controller, and proxies them to the C++ process via an
>>>> RPC interface.  The problem is that the C++ process and the Java-Helix
>>>> proxy can fail independently.  I'm not worried about the Java-Helix proxy
>>>> crashing since that would knock off all partitions in the C++ process from
>>>> Helix's view, which does not affect correctness.
>>>>
>>>> But when the C++ process crashes, the Java-Helix proxy needs to let the
>>>> controller know asap, so the Helix "external view" can be updated,
>>>> rebalancing can start, etc.  One alternative is to invoke
>>>> "manager.disconnect()" from the Helix proxy.  But this would knock off all
>>>> partitions managed by the proxy (I want to retain the ability for the proxy
>>>> to manage multiple C++ programs).  Hence the question about selectively
>>>> dropping certain partitions, viz., the ones in a crashed C++ program.
>>>>
>>>>
>>>>> To summarize, you can achieve this in AUTO and CUSTOM but not in
>>>>> AUTO_REBALANCE mode, because there the controller's goal is always to
>>>>> assign the partitions evenly among nodes. But you bring up a good use
>>>>> case; depending on the desired behavior, we might be able to support it easily.
>>>>>
>>>>> 1. Disable a partition on a given node: disabling a partition on a
>>>>> particular node should automatically trigger rebalancing. This can be done
>>>>> either by an admin using the command-line tool:
>>>>>
>>>>> helix-admin.sh --zkSvr <ZookeeperServerAddress(Required)>
>>>>> --enablePartition <clusterName instanceName resourceName partitionName
>>>>> true/false>
>>>>>
>>>>> or programmatically, if you have access to the manager:
>>>>>
>>>>> manager.getClusterManagementTool().enablePartition(enabled,
>>>>>     clusterName, instanceName, resourceName, partitionNames);
>>>>>
>>>>> This can be done in AUTO and CUSTOM.
>>>>>
>>>> I am not sure this will have the right effect in the scenario described
>>>> above.  Specifically, the Java proxy would need to disable all the crashed
>>>> partitions, and then re-enable them when the C++ DDS process reboots
>>>> successfully.  If the disable-enable transitions happen too quickly, could
>>>> the controller possibly miss the transition for some partition and not do
>>>> anything?
>>>>
>>>>> 2. The other option is to change the partition --> node mapping in
>>>>> the ideal state. (You can do this programmatically in CUSTOM mode, and
>>>>> somewhat in AUTO mode as well.) Doing this will send transitions to the
>>>>> node to drop the partitions and reassign them to other nodes.
>>>>>
>>>>
>>>> Yes, this seems like the most logical thing.  The Java proxy will
>>>> probably need to send a message to the controller to trigger this change in
>>>> the ideal states of all crashed partitions.  The messaging API would
>>>> probably be useful here.
>>>>
>>>> Another alternative I'm considering is for the Java proxy to add a
>>>> "fake" instance for each C++ process that it spawns locally.  The custom
>>>> rebalancer (that I'd write inside the controller) would then schedule the
>>>> C++ DDS partitions on to these "fake" live instances.  When the C++ process
>>>> crashes, the Java proxy would simply disconnect the corresponding fake
>>>> instance's manager.  Does this make sense to you?  Or do you have any other
>>>> thoughts?
>>>>
>>>> Thanks again for your thoughtful feedback!
>>>> Abhishek
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Kishore G
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Feb 1, 2013 at 5:44 PM, Abhishek Rai <ab...@gmail.com>wrote:
>>>>>
>>>>>> Hi Helix users!
>>>>>>
>>>>>> I'm a Helix newbie and need some advice about a use case.  I'm using
>>>>>> Helix to manage a storage system which fits the description of a DDS
>>>>>> ("distributed data service" as defined in the Helix SOCC paper).  Each
>>>>>> participant hosts a bunch of partitions of a resource, as assigned by the
>>>>>> controller.  The set of partitions assigned to a participant changes
>>>>>> dynamically as the controller rebalances partitions, nodes join or leave,
>>>>>> etc.
>>>>>>
>>>>>> Additionally, I need the ability for a participant to "drop" a subset
>>>>>> of partitions currently assigned to it.  When a partition is dropped by a
>>>>>> participant, Helix would remove the partition from the current state of the
>>>>>> instance, update the external view, and make the partition available for
>>>>>> rebalancing by the controller.  Does the Java API provide a way of
>>>>>> accomplishing this?  If not, are there any workarounds?  Or, was there a
>>>>>> design rationale to disallow such actions from the participant?
>>>>>>
>>>>>> Thanks,
>>>>>> Abhishek
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Question about participant dropping a subset of assigned partitions

Posted by Abhishek Rai <ab...@gmail.com>.
Thanks again for the thoughtful feedback Kishore.

On Sat, Feb 2, 2013 at 12:20 AM, kishore g <g....@gmail.com> wrote:

> Thanks Abhishek, you are thinking in the right direction.
>
> Your point about the disable-enable happening too quickly is valid. However,
> how fast can you detect the C++ process crash and restart? Is the C++
> process running in daemon mode and restarted automatically, or will the
> helix-java-proxy agent be responsible for starting the C++ process?
>

Both options are possible at this time, but I was evaluating the
disable-partition suggestion for soundness.  However, I agree that it's a
highly unlikely scenario.


> The messaging solution will work but the alternative of modeling each c++
> process as a participant makes sense and is the right thing to do.
>
> Can you provide more details on "it's ok if the java-proxy dies"? Why does
> it not affect correctness? When the agent is restarted, does it
> re-register the C++ instances? Do you plan to store the C++ pid in Helix so
> that the agent can remember the processes it had started?
>

Death or restarts of the Java proxy may result in temporary
unavailability of some data until the controller rebalances the lost
partitions.  Death or restarts of the C++ DDS process also affect
availability, but if the Java proxy stays up, then (1) the controller may not
notice the unavailability, and (2) DDS clients may continue to think that
their data is reachable when it's not.  Sorry, I misused the term
"correctness" to describe the weaker availability in the latter case.


>
> The reason I am asking these questions is that there is a similar effort on
> writing a standalone Helix agent that acts as a proxy for other processes
> started on the node. In general, this approach seems to be quite useful and
> might have some common functionality that can be leveraged across multiple
> implementations.
>

That's great!  The proxy agent will be very useful.  I wonder if a good
goal would be to enable recursion in the proxy such that the participant
itself can be a controller for another Helix cluster.  Thus the
proxy-participant could delegate its set of assigned resources to another
set of participants.  This may be trivially true.


> As for writing the custom rebalancer, you have two options: 1) as you
> mentioned, you can write it inside the controller; 2) there is another
> feature called CustomCodeInvoker. You can write your logic to change the
> idealstate in it and simply run it along with your Java proxy, and Helix
> will ensure it is actively running on only one node. This adds an overhead
> of around 50-100 ms in reacting to failures but is much cleaner. If you are
> doing 1), you need to be careful not to change existing code but simply add
> a new stage in the pipeline. That way you will be able to upgrade Helix to
> get new features without breaking your functionality.
>

Thanks for the suggestions.  I'm doing 1), but in a slightly different way;
please let me know if I'm totally off in the wrong direction :-)  Or if I'm
likely to run into problems with future Helix upgrades.

I've subclassed GenericHelixController and implemented all listener
callbacks.  This subclass registers for all events and ensures that
GenericHelixController listeners run for each event.  Internally, it
implements the scheduling logic that it needs and applies it via
ZKHelixAdmin.setResourceIdealState().  Do you see any clear benefits of
changing this to insert a new stage in GenericHelixController's pipeline?
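
For concreteness, the apply step in my subclass boils down to this (a
sketch; the assignment map and names are made up, and I go through
manager.getClusterManagementTool() rather than constructing a ZKHelixAdmin
directly):

import java.util.Map;
import org.apache.helix.HelixAdmin;
import org.apache.helix.model.IdealState;

void applyAssignment(String clusterName, String resourceName,
                     Map<String, Map<String, String>> assignment) {
  HelixAdmin admin = manager.getClusterManagementTool();
  IdealState idealState = admin.getResourceIdealState(clusterName, resourceName);
  for (Map.Entry<String, Map<String, String>> e : assignment.entrySet()) {
    // Each map field is partitionName -> {instanceName: desiredState}.
    idealState.getRecord().setMapField(e.getKey(), e.getValue());
  }
  admin.setResourceIdealState(clusterName, resourceName, idealState);
}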

I'm following a similar scheme in the custom participant except that it
directly registers the listeners with Helix without using a
GenericHelixController.  I will take a closer look at CustomCodeInvoker,
looks very useful.


>
> On the topic of "Helix does not have a c++ library", do you think it would
> make things easier if there were a C++ library? This may not be that hard to
> write, because the only thing that needs to be written in C++ is the
> participant code, which simply acts on messages from the controller. The
> majority of the code is in the controller, and it can still run as Java. We
> are working on a Python agent, and I hope someone will write a C++ agent.
>

Thanks for the update; yeah, I am thinking of taking a stab at it in 1-2
months if it still seems useful.


>
> One of the good things about modeling each C++ process as an instance is
> that in the future, if there is a C++ Helix agent, then you can easily migrate to it.
>

Cool.
Thanks again!
Abhishek


>
> Hope this helps.
>
> thanks,
> Kishore G
>
>
>
> On Fri, Feb 1, 2013 at 9:44 PM, Terence Yim <ch...@gmail.com> wrote:
>
>> Hi,
>>
>> What do you mean by the "fake" live instance that you mentioned? I think
>> the Java proxy could simply create one HelixManager participant per C++
>> instance (hence there would be N HelixManager instances in the Java proxy)
>> and disconnect each one based on the liveness of its C++ process.
>>
>> Terence
>>
>> On Fri, Feb 1, 2013 at 7:19 PM, Abhishek Rai <ab...@gmail.com>wrote:
>>
>>> Thanks for the quick and thoughtful response Kishore!  Comments inline.
>>>
>>>
>>> On Fri, Feb 1, 2013 at 6:33 PM, kishore g <g....@gmail.com> wrote:
>>>
>>>> Hi Abhishek,
>>>>
>>>> Thanks for the good question. We have two options (listed later in this
>>>> email) for allowing a participant to drop partitions.  However, this works
>>>> in only two (AUTO, CUSTOM) of the three modes (AUTO_REBALANCE, AUTO,
>>>> CUSTOM) that the controller supports.
>>>> More info about the modes is here:
>>>> http://helix.incubator.apache.org/Features.html
>>>>
>>>> Can you let me know which mode you are running in?
>>>>
>>>
>>> We are planning to use CUSTOM mode since we have some specific
>>> requirements about (1) the desired state for each partition, and (2)
>>> scheduling partitions to instances.  Our requirements for (1) are not
>>> expressible in the FSM framework.
>>>
>>> Also, is it sufficient if the disabled partitions are reassigned
>>>> uniformly to other nodes, or do you want partitions from other nodes
>>>> to be assigned to this node as well?
>>>>
>>>
>>> Once a participant disables some partitions, it's alright for the
>>> default rebalancing logic to kick in.
>>>
>>>
>>>>
>>>> Also, it will help us if you can tell us the use case in which you need
>>>> this feature.
>>>>
>>>
>>> Sure, I'm still trying to hash things out but here is a summary.  The
>>> DDS nodes are C++ processes, which is the crux of the problem.  AFAIK Helix
>>> does not have a C++ library, so I'm planning to use a participant written
>>> in Java, which runs as a separate process on the same node, receives state
>>> transitions from the controller, and proxies them to the C++ process via an
>>> RPC interface.  The problem is that the C++ process and the Java-Helix
>>> proxy can fail independently.  I'm not worried about the Java-Helix proxy
>>> crashing since that would knock off all partitions in the C++ process from
>>> Helix's view, which does not affect correctness.
>>>
>>> But when the C++ process crashes, the Java-Helix proxy needs to let the
>>> controller know asap, so the Helix "external view" can be updated,
>>> rebalancing can start, etc.  One alternative is to invoke
>>> "manager.disconnect()" from the Helix proxy.  But this would knock off all
>>> partitions managed by the proxy (I want to retain the ability for the proxy
>>> to manage multiple C++ programs).  Hence the question about selectively
>>> dropping certain partitions, viz., the ones in a crashed C++ program.
>>>
>>>
>>>> To summarize, you can achieve this in AUTO and CUSTOM but not in
>>>> AUTO_REBALANCE mode, because there the controller's goal is always to
>>>> assign the partitions evenly among nodes. But you bring up a good use
>>>> case; depending on the desired behavior, we might be able to support it easily.
>>>>
>>>> 1. Disable a partition on a given node: disabling a partition on a
>>>> particular node should automatically trigger rebalancing. This can be done
>>>> either by an admin using the command-line tool:
>>>>
>>>> helix-admin.sh --zkSvr <ZookeeperServerAddress(Required)>
>>>> --enablePartition <clusterName instanceName resourceName partitionName
>>>> true/false>
>>>>
>>>> or programmatically, if you have access to the manager:
>>>>
>>>> manager.getClusterManagementTool().enablePartition(enabled,
>>>>     clusterName, instanceName, resourceName, partitionNames);
>>>>
>>>> This can be done in AUTO and CUSTOM.
>>>>
>>> I am not sure this will have the right effect in the scenario described
>>> above.  Specifically, the Java proxy would need to disable all the crashed
>>> partitions, and then re-enable them when the C++ DDS process reboots
>>> successfully.  If the disable-enable transitions happen too quickly, could
>>> the controller possibly miss the transition for some partition and not do
>>> anything?
>>>
>>>> 2. The other option is to change the partition --> node mapping in
>>>> the ideal state. (You can do this programmatically in CUSTOM mode, and
>>>> somewhat in AUTO mode as well.) Doing this will send transitions to the
>>>> node to drop the partitions and reassign them to other nodes.
>>>>
>>>
>>> Yes, this seems like the most logical thing.  The Java proxy will
>>> probably need to send a message to the controller to trigger this change in
>>> the ideal states of all crashed partitions.  The messaging API would
>>> probably be useful here.
>>>
>>> Another alternative I'm considering is for the Java proxy to add a
>>> "fake" instance for each C++ process that it spawns locally.  The custom
>>> rebalancer (that I'd write inside the controller) would then schedule the
>>> C++ DDS partitions on to these "fake" live instances.  When the C++ process
>>> crashes, the Java proxy would simply disconnect the corresponding fake
>>> instance's manager.  Does this make sense to you?  Or do you have any other
>>> thoughts?
>>>
>>> Thanks again for your thoughtful feedback!
>>> Abhishek
>>>
>>>
>>>>
>>>>
>>>>
>>>> Thanks,
>>>> Kishore G
>>>>
>>>>
>>>>
>>>> On Fri, Feb 1, 2013 at 5:44 PM, Abhishek Rai <ab...@gmail.com>wrote:
>>>>
>>>>> Hi Helix users!
>>>>>
>>>>> I'm a Helix newbie and need some advice about a use case.  I'm using
>>>>> Helix to manage a storage system which fits the description of a DDS
>>>>> ("distributed data service" as defined in the Helix SOCC paper).  Each
>>>>> participant hosts a bunch of partitions of a resource, as assigned by the
>>>>> controller.  The set of partitions assigned to a participant changes
>>>>> dynamically as the controller rebalances partitions, nodes join or leave,
>>>>> etc.
>>>>>
>>>>> Additionally, I need the ability for a participant to "drop" a subset
>>>>> of partitions currently assigned to it.  When a partition is dropped by a
>>>>> participant, Helix would remove the partition from the current state of the
>>>>> instance, update the external view, and make the partition available for
>>>>> rebalancing by the controller.  Does the Java API provide a way of
>>>>> accomplishing this?  If not, are there any workarounds?  Or, was there a
>>>>> design rationale to disallow such actions from the participant?
>>>>>
>>>>> Thanks,
>>>>> Abhishek
>>>>>
>>>>
>>>>
>>>
>>
>

Re: Question about participant dropping a subset of assigned partitions

Posted by kishore g <g....@gmail.com>.
Thanks Abhishek, you are thinking in the right direction.

Your point about the disable-enable happening too quickly is valid. However,
how fast can you detect the C++ process crash and restart? Is the C++
process running in daemon mode and restarted automatically, or will the
helix-java-proxy agent be responsible for starting the C++ process?

The messaging solution will work but the alternative of modeling each c++
process as a participant makes sense and is the right thing to do.

Can you provide more details on "it's ok if the java-proxy dies"? Why does it
not affect correctness? When the agent is restarted, does it re-register
the C++ instances? Do you plan to store the C++ pid in Helix so that the
agent can remember the processes it had started?

The reason I am asking these questions is that there is a similar effort on
writing a standalone Helix agent that acts as a proxy for other processes
started on the node. In general, this approach seems to be quite useful and
might have some common functionality that can be leveraged across multiple
implementations.

As for writing the custom rebalancer, you have two options: 1) as you
mentioned, you can write it inside the controller; 2) there is another
feature called CustomCodeInvoker. You can write your logic to change the
idealstate in it and simply run it along with your Java proxy, and Helix
will ensure it is actively running on only one node. This adds an overhead
of around 50-100 ms in reacting to failures but is much cleaner. If you are
doing 1), you need to be careful not to change existing code but simply add
a new stage in the pipeline. That way you will be able to upgrade Helix to
get new features without breaking your functionality.

On the topic of "Helix does not have a c++ library", do you think it would
make things easier if there were a C++ library? This may not be that hard to
write, because the only thing that needs to be written in C++ is the
participant code, which simply acts on messages from the controller. The
majority of the code is in the controller, and it can still run as Java. We
are working on a Python agent, and I hope someone will write a C++ agent.

One of the good things about modeling each C++ process as an instance is that
in the future, if there is a C++ Helix agent, then you can easily migrate to it.

Hope this helps.

thanks,
Kishore G



On Fri, Feb 1, 2013 at 9:44 PM, Terence Yim <ch...@gmail.com> wrote:

> Hi,
>
> What do you mean by the "fake" live instance that you mentioned? I think
> the Java proxy could simply create one HelixManager participant per C++
> instance (hence there would be N HelixManager instances in the Java proxy)
> and disconnect each one based on the liveness of its C++ process.
>
> Terence
>
> On Fri, Feb 1, 2013 at 7:19 PM, Abhishek Rai <ab...@gmail.com>wrote:
>
>> Thanks for the quick and thoughtful response Kishore!  Comments inline.
>>
>>
>> On Fri, Feb 1, 2013 at 6:33 PM, kishore g <g....@gmail.com> wrote:
>>
>>> Hi Abhishek,
>>>
>>> Thanks for the good question. We have two options (listed later in this
>>> email) for allowing a participant to drop partitions.  However, this works
>>> in only two (AUTO, CUSTOM) of the three modes (AUTO_REBALANCE, AUTO,
>>> CUSTOM) that the controller supports.
>>> More info about the modes is here:
>>> http://helix.incubator.apache.org/Features.html
>>>
>>> Can you let me know which mode you are running in?
>>>
>>
>> We are planning to use CUSTOM mode since we have some specific
>> requirements about (1) the desired state for each partition, and (2)
>> scheduling partitions to instances.  Our requirements for (1) are not
>> expressible in the FSM framework.
>>
>> Also, is it sufficient if the disabled partitions are reassigned
>>> uniformly to other nodes, or do you want partitions from other nodes
>>> to be assigned to this node as well?
>>>
>>
>> Once a participant disables some partitions, it's alright for the default
>> rebalancing logic to kick in.
>>
>>
>>>
>>> Also, it will help us if you can tell us the use case in which you need
>>> this feature.
>>>
>>
>> Sure, I'm still trying to hash things out but here is a summary.  The DDS
>> nodes are C++ processes, which is the crux of the problem.  AFAIK Helix
>> does not have a C++ library, so I'm planning to use a participant written
>> in Java, which runs as a separate process on the same node, receives state
>> transitions from the controller, and proxies them to the C++ process via an
>> RPC interface.  The problem is that the C++ process and the Java-Helix
>> proxy can fail independently.  I'm not worried about the Java-Helix proxy
>> crashing since that would knock off all partitions in the C++ process from
>> Helix's view, which does not affect correctness.
>>
>> But when the C++ process crashes, the Java-Helix proxy needs to let the
>> controller know asap, so the Helix "external view" can be updated,
>> rebalancing can start, etc.  One alternative is to invoke
>> "manager.disconnect()" from the Helix proxy.  But this would knock off all
>> partitions managed by the proxy (I want to retain the ability for the proxy
>> to manage multiple C++ programs).  Hence the question about selectively
>> dropping certain partitions, viz., the ones in a crashed C++ program.
>>
>>
>>> To summarize, you can achieve this in AUTO and CUSTOM but not in
>>> AUTO_REBALANCE mode, because there the controller's goal is always to
>>> assign the partitions evenly among nodes. But you bring up a good use
>>> case; depending on the desired behavior, we might be able to support it easily.
>>>
>>> 1. Disable a partition on a given node: disabling a partition on a
>>> particular node should automatically trigger rebalancing. This can be done
>>> either by an admin using the command-line tool:
>>>
>>> helix-admin.sh --zkSvr <ZookeeperServerAddress(Required)>
>>> --enablePartition <clusterName instanceName resourceName partitionName
>>> true/false>
>>>
>>> or programmatically, if you have access to the manager:
>>>
>>> manager.getClusterManagementTool().enablePartition(enabled,
>>>     clusterName, instanceName, resourceName, partitionNames);
>>>
>>> This can be done in AUTO and CUSTOM.
>>>
>> I am not sure this will have the right effect in the scenario described
>> above.  Specifically, the Java proxy would need to disable all the crashed
>> partitions, and then re-enable them when the C++ DDS process reboots
>> successfully.  If the disable-enable transitions happen too quickly, could
>> the controller possibly miss the transition for some partition and not do
>> anything?
>>
>>> 2. The other option is to change the partition --> node mapping in
>>> the ideal state. (You can do this programmatically in CUSTOM mode, and
>>> somewhat in AUTO mode as well.) Doing this will send transitions to the
>>> node to drop the partitions and reassign them to other nodes.
>>>
>>
>> Yes, this seems like the most logical thing.  The Java proxy will
>> probably need to send a message to the controller to trigger this change in
>> the ideal states of all crashed partitions.  The messaging API would
>> probably be useful here.
>>
>> Another alternative I'm considering is for the Java proxy to add a "fake"
>> instance for each C++ process that it spawns locally.  The custom
>> rebalancer (that I'd write inside the controller) would then schedule the
>> C++ DDS partitions on to these "fake" live instances.  When the C++ process
>> crashes, the Java proxy would simply disconnect the corresponding fake
>> instance's manager.  Does this make sense to you?  Or do you have any other
>> thoughts?
>>
>> Thanks again for your thoughtful feedback!
>> Abhishek
>>
>>
>>>
>>>
>>>
>>> Thanks,
>>> Kishore G
>>>
>>>
>>>
>>> On Fri, Feb 1, 2013 at 5:44 PM, Abhishek Rai <ab...@gmail.com>wrote:
>>>
>>>> Hi Helix users!
>>>>
>>>> I'm a Helix newbie and need some advice about a use case.  I'm using
>>>> Helix to manage a storage system which fits the description of a DDS
>>>> ("distributed data service" as defined in the Helix SOCC paper).  Each
>>>> participant hosts a bunch of partitions of a resource, as assigned by the
>>>> controller.  The set of partitions assigned to a participant changes
>>>> dynamically as the controller rebalances partitions, nodes join or leave,
>>>> etc.
>>>>
>>>> Additionally, I need the ability for a participant to "drop" a subset
>>>> of partitions currently assigned to it.  When a partition is dropped by a
>>>> participant, Helix would remove the partition from the current state of the
>>>> instance, update the external view, and make the partition available for
>>>> rebalancing by the controller.  Does the Java API provide a way of
>>>> accomplishing this?  If not, are there any workarounds?  Or, was there a
>>>> design rationale to disallow such actions from the participant?
>>>>
>>>> Thanks,
>>>> Abhishek
>>>>
>>>
>>>
>>
>

Re: Question about participant dropping a subset of assigned partitions

Posted by Terence Yim <ch...@gmail.com>.
Hi,

What do you mean by the "fake" live instance that you mentioned? I think
the Java proxy could simply create one HelixManager participant per C++
instance (hence there would be N HelixManager instances in the Java proxy)
and disconnect each one based on the liveness of its C++ process.
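
Something like this per C++ process (a sketch; the instance name, state
model definition name, and factory are placeholders):

import org.apache.helix.HelixManager;
import org.apache.helix.HelixManagerFactory;
import org.apache.helix.InstanceType;

HelixManager participant = HelixManagerFactory.getZKHelixManager(
    clusterName, "node1_cppProc0", InstanceType.PARTICIPANT, zkAddr);
participant.getStateMachineEngine()
    .registerStateModelFactory("OnlineOffline", stateModelFactory);
participant.connect();    // the C++ process appears as a live instance

// ... later, when that C++ process dies:
participant.disconnect(); // its partitions become eligible for reassignment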

Terence

On Fri, Feb 1, 2013 at 7:19 PM, Abhishek Rai <ab...@gmail.com> wrote:

> Thanks for the quick and thoughtful response Kishore!  Comments inline.
>
>
> On Fri, Feb 1, 2013 at 6:33 PM, kishore g <g....@gmail.com> wrote:
>
>> Hi Abhishek,
>>
>> Thanks for the good question. We have two options (listed later in this
>> email) for allowing a participant to drop partitions.  However, this works
>> in only two (AUTO, CUSTOM) of the three modes (AUTO_REBALANCE, AUTO,
>> CUSTOM) that the controller supports.
>> More info about the modes is here:
>> http://helix.incubator.apache.org/Features.html
>>
>> Can you let me know which mode you are running in?
>>
>
> We are planning to use CUSTOM mode since we have some specific
> requirements about (1) the desired state for each partition, and (2)
> scheduling partitions to instances.  Our requirements for (1) are not
> expressible in the FSM framework.
>
> Also, is it sufficient if the disabled partitions are reassigned uniformly
>> to other nodes, or do you want partitions from other nodes to be
>> assigned to this node as well?
>>
>
> Once a participant disables some partitions, it's alright for the default
> rebalancing logic to kick in.
>
>
>>
>> Also, it will help us if you can tell us the use case in which you need
>> this feature.
>>
>
> Sure, I'm still trying to hash things out but here is a summary.  The DDS
> nodes are C++ processes, which is the crux of the problem.  AFAIK Helix
> does not have a C++ library, so I'm planning to use a participant written
> in Java, which runs as a separate process on the same node, receives state
> transitions from the controller, and proxies them to the C++ process via an
> RPC interface.  The problem is that the C++ process and the Java-Helix
> proxy can fail independently.  I'm not worried about the Java-Helix proxy
> crashing since that would knock off all partitions in the C++ process from
> Helix's view, which does not affect correctness.
>
> But when the C++ process crashes, the Java-Helix proxy needs to let the
> controller know asap, so the Helix "external view" can be updated,
> rebalancing can start, etc.  One alternative is to invoke
> "manager.disconnect()" from the Helix proxy.  But this would knock off all
> partitions managed by the proxy (I want to retain the ability for the proxy
> to manage multiple C++ programs).  Hence the question about selectively
> dropping certain partitions, viz., the ones in a crashed C++ program.
>
>
>> To summarize, you can achieve this in AUTO and CUSTOM but not in
>> AUTO_REBALANCE mode, because there the controller's goal is always to
>> assign the partitions evenly among nodes. But you bring up a good use
>> case; depending on the desired behavior, we might be able to support it easily.
>>
>> 1. Disable a partition on a given node: disabling a partition on a
>> particular node should automatically trigger rebalancing. This can be done
>> either by an admin using the command-line tool:
>>
>> helix-admin.sh --zkSvr <ZookeeperServerAddress(Required)>
>> --enablePartition <clusterName instanceName resourceName partitionName
>> true/false>
>>
>> or programmatically, if you have access to the manager:
>>
>> manager.getClusterManagementTool().enablePartition(enabled,
>>     clusterName, instanceName, resourceName, partitionNames);
>>
>> This can be done in AUTO and CUSTOM.
>>
> I am not sure this will have the right effect in the scenario described
> above.  Specifically, the Java proxy would need to disable all the crashed
> partitions, and then re-enable them when the C++ DDS process reboots
> successfully.  If the disable-enable transitions happen too quickly, could
> the controller possibly miss the transition for some partition and not do
> anything?
>
>> 2. The other option is to change the partition --> node mapping in the
>> ideal state. (You can do this programmatically in CUSTOM mode, and somewhat
>> in AUTO mode as well.) Doing this will send transitions to the node to
>> drop the partitions and reassign them to other nodes.
>>
>
> Yes, this seems like the most logical thing.  The Java proxy will probably
> need to send a message to the controller to trigger this change in the
> ideal states of all crashed partitions.  The messaging API would probably
> be useful here.
>
> Another alternative I'm considering is for the Java proxy to add a "fake"
> instance for each C++ process that it spawns locally.  The custom
> rebalancer (that I'd write inside the controller) would then schedule the
> C++ DDS partitions on to these "fake" live instances.  When the C++ process
> crashes, the Java proxy would simply disconnect the corresponding fake
> instance's manager.  Does this make sense to you?  Or do you have any other
> thoughts?
>
> Thanks again for your thoughtful feedback!
> Abhishek
>
>
>>
>>
>>
>> Thanks,
>> Kishore G
>>
>>
>>
>> On Fri, Feb 1, 2013 at 5:44 PM, Abhishek Rai <ab...@gmail.com>wrote:
>>
>>> Hi Helix users!
>>>
>>> I'm a Helix newbie and need some advice about a use case.  I'm using
>>> Helix to manage a storage system which fits the description of a DDS
>>> ("distributed data service" as defined in the Helix SOCC paper).  Each
>>> participant hosts a bunch of partitions of a resource, as assigned by the
>>> controller.  The set of partitions assigned to a participant changes
>>> dynamically as the controller rebalances partitions, nodes join or leave,
>>> etc.
>>>
>>> Additionally, I need the ability for a participant to "drop" a subset of
>>> partitions currently assigned to it.  When a partition is dropped by a
>>> participant, Helix would remove the partition from the current state of the
>>> instance, update the external view, and make the partition available for
>>> rebalancing by the controller.  Does the Java API provide a way of
>>> accomplishing this?  If not, are there any workarounds?  Or, was there a
>>> design rationale to disallow such actions from the participant?
>>>
>>> Thanks,
>>> Abhishek
>>>
>>
>>
>

Re: Question about participant dropping a subset of assigned partitions

Posted by Abhishek Rai <ab...@gmail.com>.
Thanks for the quick and thoughtful response Kishore!  Comments inline.


On Fri, Feb 1, 2013 at 6:33 PM, kishore g <g....@gmail.com> wrote:

> Hi Abhishek,
>
> Thanks for the good question. We have two options (listed later in this
> email) for allowing a participant to drop partitions.  However, this works
> in only two (AUTO, CUSTOM) of the three modes (AUTO_REBALANCE, AUTO,
> CUSTOM) that the controller supports.
> More info about the modes is here: http://helix.incubator.apache.org/Features.html
>
> Can you let me know which mode you are running in?
>

We are planning to use CUSTOM mode since we have some specific requirements
about (1) the desired state for each partition, and (2) scheduling
partitions to instances.  Our requirements for (1) are not expressible in
the FSM framework.

Also, is it sufficient if the disabled partitions are reassigned uniformly
> to other nodes, or do you want partitions from other nodes to be
> assigned to this node as well?
>

Once a participant disables some partitions, it's alright for the default
rebalancing logic to kick in.


>
> Also, it will help us if you can tell us the use case in which you need
> this feature.
>

Sure, I'm still trying to hash things out but here is a summary.  The DDS
nodes are C++ processes, which is the crux of the problem.  AFAIK Helix
does not have a C++ library, so I'm planning to use a participant written
in Java, which runs as a separate process on the same node, receives state
transitions from the controller, and proxies them to the C++ process via an
RPC interface.  The problem is that the C++ process and the Java-Helix
proxy can fail independently.  I'm not worried about the Java-Helix proxy
crashing since that would knock off all partitions in the C++ process from
Helix's view, which does not affect correctness.
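
Concretely, I'm picturing the participant side as a thin state model that
forwards each transition over RPC, roughly like this (a sketch; the RPC
interface is hypothetical and I'm assuming a plain ONLINE/OFFLINE model):

import org.apache.helix.NotificationContext;
import org.apache.helix.model.Message;
import org.apache.helix.participant.statemachine.StateModel;
import org.apache.helix.participant.statemachine.StateModelInfo;
import org.apache.helix.participant.statemachine.Transition;

// Hypothetical RPC stub talking to the local C++ process.
interface DdsRpcClient {
  void bringOnline(String partition);
  void takeOffline(String partition);
}

@StateModelInfo(initialState = "OFFLINE", states = { "ONLINE", "OFFLINE" })
public class ProxyStateModel extends StateModel {
  private final DdsRpcClient rpc;
  private final String partition;

  public ProxyStateModel(DdsRpcClient rpc, String partition) {
    this.rpc = rpc;
    this.partition = partition;
  }

  @Transition(from = "OFFLINE", to = "ONLINE")
  public void onBecomeOnlineFromOffline(Message msg, NotificationContext ctx) {
    rpc.bringOnline(partition); // forward the transition to the C++ process
  }

  @Transition(from = "ONLINE", to = "OFFLINE")
  public void onBecomeOfflineFromOnline(Message msg, NotificationContext ctx) {
    rpc.takeOffline(partition);
  }
}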

But when the C++ process crashes, the Java-Helix proxy needs to let the
controller know asap, so the Helix "external view" can be updated,
rebalancing can start, etc.  One alternative is to invoke
"manager.disconnect()" from the Helix proxy.  But this would knock off all
partitions managed by the proxy (I want to retain the ability for the proxy
to manage multiple C++ programs).  Hence the question about selectively
dropping certain partitions, viz., the ones in a crashed C++ program.


> To summarize, you can achieve this in AUTO and CUSTOM but not in
> AUTO_REBALANCE mode, because there the controller's goal is always to
> assign the partitions evenly among nodes. But you bring up a good use
> case; depending on the desired behavior, we might be able to support it easily.
>
> 1. Disable a partition on a given node: disabling a partition on a
> particular node should automatically trigger rebalancing. This can be done
> either by an admin using the command-line tool:
>
> helix-admin.sh --zkSvr <ZookeeperServerAddress(Required)>
> --enablePartition <clusterName instanceName resourceName partitionName
> true/false>
>
> or programmatically, if you have access to the manager:
>
> manager.getClusterManagementTool().enablePartition(enabled,
>     clusterName, instanceName, resourceName, partitionNames);
>
> This can be done in AUTO and CUSTOM.
>
I am not sure this will have the right effect in the scenario described
above.  Specifically, the Java proxy would need to disable all the crashed
partitions, and then re-enable them when the C++ DDS process reboots
successfully.  If the disable-enable transitions happen too quickly, could
the controller possibly miss the transition for some partition and not do
anything?

> 2. The other option is to change the partition --> node mapping in the
> ideal state. (You can do this programmatically in CUSTOM mode, and somewhat
> in AUTO mode as well.) Doing this will send transitions to the node to
> drop the partitions and reassign them to other nodes.
>

Yes, this seems like the most logical thing.  The Java proxy will probably
need to send a message to the controller to trigger this change in the
ideal states of all crashed partitions.  The messaging API would probably
be useful here.
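
I was imagining something like this for the notification (a sketch; the
message subtype is made up and I haven't verified the Criteria details):

import java.util.UUID;
import org.apache.helix.Criteria;
import org.apache.helix.HelixManager;
import org.apache.helix.InstanceType;
import org.apache.helix.model.Message;
import org.apache.helix.model.Message.MessageType;

void notifyController(HelixManager manager) {
  // Address the message to the (single) active controller.
  Criteria recipient = new Criteria();
  recipient.setRecipientInstanceType(InstanceType.CONTROLLER);
  recipient.setSessionSpecific(true);

  Message msg = new Message(MessageType.USER_DEFINE_MSG,
      UUID.randomUUID().toString());
  msg.setMsgSubType("PARTITIONS_CRASHED"); // hypothetical subtype
  manager.getMessagingService().send(recipient, msg);
}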

Another alternative I'm considering is for the Java proxy to add a "fake"
instance for each C++ process that it spawns locally.  The custom
rebalancer (that I'd write inside the controller) would then schedule the
C++ DDS partitions on to these "fake" live instances.  When the C++ process
crashes, the Java proxy would simply disconnect the corresponding fake
instance's manager.  Does this make sense to you?  Or do you have any other
thoughts?

Thanks again for your thoughtful feedback!
Abhishek


>
>
>
> Thanks,
> Kishore G
>
>
>
> On Fri, Feb 1, 2013 at 5:44 PM, Abhishek Rai <ab...@gmail.com>wrote:
>
>> Hi Helix users!
>>
>> I'm a Helix newbie and need some advice about a use case.  I'm using
>> Helix to manage a storage system which fits the description of a DDS
>> ("distributed data service" as defined in the Helix SOCC paper).  Each
>> participant hosts a bunch of partitions of a resource, as assigned by the
>> controller.  The set of partitions assigned to a participant changes
>> dynamically as the controller rebalances partitions, nodes join or leave,
>> etc.
>>
>> Additionally, I need the ability for a participant to "drop" a subset of
>> partitions currently assigned to it.  When a partition is dropped by a
>> participant, Helix would remove the partition from the current state of the
>> instance, update the external view, and make the partition available for
>> rebalancing by the controller.  Does the Java API provide a way of
>> accomplishing this?  If not, are there any workarounds?  Or, was there a
>> design rationale to disallow such actions from the participant?
>>
>> Thanks,
>> Abhishek
>>
>
>

Re: Question about participant dropping a subset of assigned partitions

Posted by kishore g <g....@gmail.com>.
Hi Abhishek,

Thanks for the good question. We have two options (listed later in this
email) for allowing a participant to drop partitions.  However, this works
in only two (AUTO, CUSTOM) of the three modes (AUTO_REBALANCE, AUTO,
CUSTOM) that the controller supports.
More info about the modes is here: http://helix.incubator.apache.org/Features.html

Can you let me know which mode you are running in? Also, is it sufficient
if the disabled partitions are reassigned uniformly to other nodes, or do
you want partitions from other nodes to be assigned to this node as well?

Also, it will help us if you can tell us the use case in which you need
this feature.

To summarize, you can achieve this in AUTO and CUSTOM but not in
AUTO_REBALANCE mode, because there the controller's goal is always to
assign the partitions evenly among nodes. But you bring up a good use
case; depending on the desired behavior, we might be able to support it easily.

1. Disable a partition on a given node: disabling a partition on a
particular node should automatically trigger rebalancing. This can be done
either by an admin using the command-line tool:

helix-admin.sh --zkSvr <ZookeeperServerAddress(Required)> --enablePartition
<clusterName instanceName resourceName partitionName true/false>

or programmatically, if you have access to the manager:

manager.getClusterManagementTool().enablePartition(enabled,
    clusterName, instanceName, resourceName, partitionNames);

This can be done in AUTO and CUSTOM.
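
For example, from the Java proxy (a sketch; the resource and partition
names are made up):

import java.util.Arrays;
import java.util.List;

List<String> crashed = Arrays.asList("myDB_0", "myDB_3");
// On C++ process crash: disable its partitions on this instance.
manager.getClusterManagementTool().enablePartition(
    false, clusterName, instanceName, "myDB", crashed);
// Once the C++ process is healthy again: re-enable them.
manager.getClusterManagementTool().enablePartition(
    true, clusterName, instanceName, "myDB", crashed);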

2. The other option is to change the partition --> node mapping in the
ideal state. (You can do this programmatically in CUSTOM mode, and somewhat
in AUTO mode as well.) Doing this will send transitions to the node to
drop the partitions and reassign them to other nodes.
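
In CUSTOM mode that change is just a rewrite of the partition's map field
in the idealstate, roughly (a sketch; the instance and partition names are
made up):

import java.util.Map;
import org.apache.helix.HelixAdmin;
import org.apache.helix.model.IdealState;

void movePartition(HelixAdmin admin, String clusterName) {
  IdealState is = admin.getResourceIdealState(clusterName, "myDB");
  Map<String, String> stateMap = is.getRecord().getMapField("myDB_3");
  stateMap.remove("deadInstance_12918");        // this node gets a drop transition
  stateMap.put("liveInstance_12919", "ONLINE"); // partition comes up elsewhere
  is.getRecord().setMapField("myDB_3", stateMap);
  admin.setResourceIdealState(clusterName, "myDB", is);
}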



Thanks,
Kishore G



On Fri, Feb 1, 2013 at 5:44 PM, Abhishek Rai <ab...@gmail.com> wrote:

> Hi Helix users!
>
> I'm a Helix newbie and need some advice about a use case.  I'm using Helix
> to manage a storage system which fits the description of a DDS
> ("distributed data service" as defined in the Helix SOCC paper).  Each
> participant hosts a bunch of partitions of a resource, as assigned by the
> controller.  The set of partitions assigned to a participant changes
> dynamically as the controller rebalances partitions, nodes join or leave,
> etc.
>
> Additionally, I need the ability for a participant to "drop" a subset of
> partitions currently assigned to it.  When a partition is dropped by a
> participant, Helix would remove the partition from the current state of the
> instance, update the external view, and make the partition available for
> rebalancing by the controller.  Does the Java API provide a way of
> accomplishing this?  If not, are there any workarounds?  Or, was there a
> design rationale to disallow such actions from the participant?
>
> Thanks,
> Abhishek
>