Posted to user@helix.apache.org by kishore g <g....@gmail.com> on 2013/02/21 01:28:48 UTC

Re: Carrying over previous session state in new participant

Hi Abhishek,

We only carry over the fact that the participant hosted that partition. The
state of that partition will be reset to the initial state (default: OFFLINE).
The idea behind this design was to detect resource deletion that happened
while the participant was down and, when it comes back up, tell it to drop
the data or any local state associated with that partition. Once the drop
notification is handled, the partition will be removed from the current
state and the external view.
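
A minimal sketch of that carry-over rule, as a toy model rather than Helix
code. The function name, the dict layout, and the partition names are
assumptions made for illustration; only the set of hosted partitions
survives, and every state is reset to the initial state.

```python
# Toy model of the carry-over rule described above (not the Helix API).
INITIAL_STATE = "OFFLINE"  # default initial state named in this thread


def carry_over(previous_session_states, initial_state=INITIAL_STATE):
    """Build the current-state map for a participant's new session.

    previous_session_states maps partition name -> state in the old session.
    The partition names are kept; each state is reset to initial_state.
    """
    return {partition: initial_state for partition in previous_session_states}


if __name__ == "__main__":
    previous = {"db_0": "MASTER", "db_1": "SLAVE"}  # hypothetical partitions
    print(carry_over(previous))
```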

Can you confirm that resetting the state to OFFLINE after restart is a
problem in your case?

If you really need to avoid this behavior, you can implement
preConnectCallback and remove the previous session info. This won't be a
problem with future Helix versions, but you will still have to confirm that
the old participant is dead. A better way would be to provide a flag that
explicitly disables carrying over the previous state. Can you please file a
jira for this? I can imagine this being useful in various use cases.
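
The cleanup logic such a callback would need can be sketched as follows.
This is a toy model only: the session-keyed dict, the live_sessions set, and
the function name are assumptions, and nothing here calls Helix or
ZooKeeper. It shows the one safety check mentioned above, namely refusing to
delete an old session's info unless that session is known to be dead.

```python
# Toy model of "remove previous session info before connecting".
# All names and data layouts are hypothetical, not the Helix API.


def clear_previous_sessions(current_states, live_sessions, new_session):
    """Delete current-state entries that belong to dead, older sessions.

    current_states maps session id -> {partition: state}.
    live_sessions is the set of session ids that are still alive.
    Raises RuntimeError if an older session is still alive.
    """
    for session in list(current_states):
        if session == new_session:
            continue
        if session in live_sessions:
            raise RuntimeError("old session %s is still alive" % session)
        del current_states[session]
    return current_states


if __name__ == "__main__":
    states = {"s_old": {"db_0": "MASTER"}, "s_new": {}}
    print(clear_previous_sessions(states, {"s_new"}, "s_new"))
```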

thanks,
Kishore G


On Wed, Feb 20, 2013 at 3:58 PM, Abhishek Rai <ab...@gmail.com> wrote:

> Hi Helix devs,
>
> Currently, when creating a session for a new participant, Helix carries
> over current states of assigned partitions from previous session of the
> same participant.  I think this may be undesirable for deployments where
> Helix session and assigned partitions by the participant are tightly
> coupled.  Assume that in such a setup, when a participant loses a session,
> it also loses all associated partitions.
>
> In this scenario, when the participant is restarted, and tries to reconnect
> to Helix, ZKHelixManager (handleNewSessionAsParticipant) currently "carries
> over" assignments from the previous session, which may not reflect true
> state of the restarted participant.  Is there an easy way to not carry over
> the state, in other words, start from scratch with no assigned partitions
> ?  If not, can you think of any possible workarounds?  I'm considering
> directly clearing old "current states" from Zookeeper.  I'd avoid doing
> this for multiple reasons: (1) compatibility with future Helix versions,
> (2) complexity: need to make sure old participant is really dead.
>
> Thanks,
> Abhishek
>

Re: Carrying over previous session state in new participant

Posted by kishore g <g....@gmail.com>.
> Answer to "who is responsible for detecting resource deletion"
The controller automatically detects the resource deletion and invokes the
drop transition. There is no need to change the ideal state to mark the
partitions as DROPPED.

Basically, the controller compares the ideal state of a resource with the
current state of the participant. If it detects that the participant is
hosting a resource it is not supposed to, it automatically sends a
transition to drop that resource.
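
As a rough illustration of that comparison (a toy model, not the
controller's actual code), the sketch below assumes the ideal state is a
dict of partition -> {instance: state} and the current state is a dict of
partition -> state for one instance. Those layouts and all names are
assumptions.

```python
# Toy model of the ideal-state vs. current-state comparison (not Helix code).


def drop_transitions(ideal_state, current_state, instance):
    """Return (partition, from_state, "DROPPED") for every partition the
    instance still reports but the ideal state no longer assigns to it."""
    transitions = []
    for partition, state in sorted(current_state.items()):
        assigned_instances = ideal_state.get(partition, {})
        if instance not in assigned_instances:
            transitions.append((partition, state, "DROPPED"))
    return transitions


if __name__ == "__main__":
    # "old_0" belongs to a resource deleted while node_1 was down.
    ideal = {"db_0": {"node_1": "MASTER"}}
    current = {"db_0": "OFFLINE", "old_0": "OFFLINE"}
    print(drop_transitions(ideal, current, "node_1"))
```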

> Will controller automatically generate the transition from OFFLINE to old
> idealstate?
Yes, if the ideal state remains the same, the controller will move the
participant back to the state defined in the ideal state.
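
To make "move the participant back" concrete, here is a small sketch that
finds the sequence of states from OFFLINE to a target state. It assumes a
MasterSlave-style state model purely for illustration; the thread does not
say which model is in use, and the function is a toy, not the controller's
algorithm.

```python
# Toy path search over an assumed MasterSlave-style state model.
from collections import deque

MASTER_SLAVE = {
    "OFFLINE": ["SLAVE", "DROPPED"],
    "SLAVE": ["MASTER", "OFFLINE"],
    "MASTER": ["SLAVE"],
    "DROPPED": [],
}


def transition_path(start, target, model=MASTER_SLAVE):
    """Return the shortest list of states from start to target, or None."""
    if start == target:
        return [start]
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        for next_state in model.get(path[-1], []):
            if next_state in seen:
                continue
            if next_state == target:
                return path + [next_state]
            seen.add(next_state)
            queue.append(path + [next_state])
    return None


if __name__ == "__main__":
    print(transition_path("OFFLINE", "MASTER"))
```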



On Thu, Feb 21, 2013 at 2:57 PM, Abhishek Rai <ab...@gmail.com> wrote:

> Thanks for explaining the semantics Kishore.  Comments inline.
>
>
> On Wed, Feb 20, 2013 at 4:28 PM, kishore g <g....@gmail.com> wrote:
>
> > Hi Abhishek,
> >
> > We only carry over the fact that the participant hosted that partition.
> The
> > state of that partition will be reset to initial state( default:OFFLINE).
> >
>
> I see, makes sense.
>
>
> > The idea behind this design was to detect resource deletion when the
> > participant was down and inform that participant when it comes up to drop
> > data or any local state associated with that partition. Once the drop
> > notification is handled, it will be removed from current state and
> external
> > view.
> >
>
> I see.  But who is responsible for "detecting resource deletion"?  Does the
> controller automatically set the ideal state as DROPPED for all partitions
> on a restarted instance?  Or is it the DDS' responsibility to detect that
> an instance is down and therefore set ideal state to DROPPED for all its
> hosted partitions.
>
>
> >
> > Can you confirm that resetting the state to OFFLINE after restart is a
> > problem in your case.
> >
>
> In my case, the DDS was getting confused by the partition's current state
> automatically recycling back to the initial state.  Besides, I wonder if
> the controller will automatically start generating transitions from OFFLINE
> towards the old ideal state (assuming the ideal state was not modified
> after the instance died).
>
>
> >
> > If you really need to avoid this behavior then you can implement
> > preConnectCallback and remove the previous session info. This wont be a
> > problem with future Helix version but you will have to still confirm that
> > old participant is dead. A better way would be provide a way to
> explicitly
> > specify a flag to not carry over the previous state. Can you please file
> a
> > jira for this. I can imagine this being useful in various use cases.
> >
>
> Thanks for the suggestion.  I'll file a jira.
>
>
> >
> > thanks,
> > Kishore G
> >
> >
> > On Wed, Feb 20, 2013 at 3:58 PM, Abhishek Rai <ab...@gmail.com>
> > wrote:
> >
> > > Hi Helix devs,
> > >
> > > Currently, when creating a session for a new participant, Helix carries
> > > over current states of assigned partitions from previous session of the
> > > same participant.  I think this may be undesirable for deployments
> where
> > > Helix session and assigned partitions by the participant are tightly
> > > coupled.  Assume that in such a setup, when a participant loses a
> > session,
> > > it also loses all associated partitions.
> > >
> > > In this scenario, when the participant is restarted, and tries to
> > reconnect
> > > to Helix, ZKHelixManager (handleNewSessionAsParticipant) currently
> > "carries
> > > over" assignments from the previous session, which may not reflect true
> > > state of the restarted participant.  Is there an easy way to not carry
> > over
> > > the state, in other words, start from scratch with no assigned
> partitions
> > > ?  If not, can you think of any possible workarounds?  I'm considering
> > > directly clearing old "current states" from Zookeeper.  I'd avoid doing
> > > this for multiple reasons: (1) compatibility with future Helix
> versions,
> > > (2) complexity: need to make sure old participant is really dead.
> > >
> > > Thanks,
> > > Abhishek
> > >
> >
>

Re: Carrying over previous session state in new participant

Posted by Abhishek Rai <ab...@gmail.com>.
Thanks for explaining the semantics, Kishore.  Comments inline.


On Wed, Feb 20, 2013 at 4:28 PM, kishore g <g....@gmail.com> wrote:

> Hi Abhishek,
>
> We only carry over the fact that the participant hosted that partition. The
> state of that partition will be reset to initial state( default:OFFLINE).
>

I see, makes sense.


> The idea behind this design was to detect resource deletion when the
> participant was down and inform that participant when it comes up to drop
> data or any local state associated with that partition. Once the drop
> notification is handled, it will be removed from current state and external
> view.
>

I see.  But who is responsible for "detecting resource deletion"?  Does the
controller automatically set the ideal state as DROPPED for all partitions
on a restarted instance?  Or is it the DDS' responsibility to detect that
an instance is down and therefore set the ideal state to DROPPED for all its
hosted partitions?


>
> Can you confirm that resetting the state to OFFLINE after restart is a
> problem in your case.
>

In my case, the DDS was getting confused by the partition's current state
automatically recycling back to the initial state.  Besides, I wonder if
the controller will automatically start generating transitions from OFFLINE
towards the old ideal state (assuming the ideal state was not modified
after the instance died).


>
> If you really need to avoid this behavior then you can implement
> preConnectCallback and remove the previous session info. This wont be a
> problem with future Helix version but you will have to still confirm that
> old participant is dead. A better way would be provide a way to explicitly
> specify a flag to not carry over the previous state. Can you please file a
> jira for this. I can imagine this being useful in various use cases.
>

Thanks for the suggestion.  I'll file a jira.


>
> thanks,
> Kishore G
>
>
> On Wed, Feb 20, 2013 at 3:58 PM, Abhishek Rai <ab...@gmail.com>
> wrote:
>
> > Hi Helix devs,
> >
> > Currently, when creating a session for a new participant, Helix carries
> > over current states of assigned partitions from previous session of the
> > same participant.  I think this may be undesirable for deployments where
> > Helix session and assigned partitions by the participant are tightly
> > coupled.  Assume that in such a setup, when a participant loses a
> session,
> > it also loses all associated partitions.
> >
> > In this scenario, when the participant is restarted, and tries to
> reconnect
> > to Helix, ZKHelixManager (handleNewSessionAsParticipant) currently
> "carries
> > over" assignments from the previous session, which may not reflect true
> > state of the restarted participant.  Is there an easy way to not carry
> over
> > the state, in other words, start from scratch with no assigned partitions
> > ?  If not, can you think of any possible workarounds?  I'm considering
> > directly clearing old "current states" from Zookeeper.  I'd avoid doing
> > this for multiple reasons: (1) compatibility with future Helix versions,
> > (2) complexity: need to make sure old participant is really dead.
> >
> > Thanks,
> > Abhishek
> >
>
