You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@myriad.apache.org by Paul Read <pd...@gmail.com> on 2015/04/01 11:31:01 UTC

Death of the RM

Is it reasonable to expect the Executor and NM to exit if the the
RM/Scheduler accidently dies or is killed? Or should a restart of the
RM/Scheduler re-sync with the running Executor/NM ?

I know there is currently no mechanism to do that but I was looking at
issue #55 and part of the problem/solution would be eliminated if the child
tasks were to terminate if the RM dies.

Re: Death of the RM

Posted by mohit soni <mo...@gmail.com>.
Adam,
I am not able to find time to continue my work on HA. Most of my changes
are already in phase1 branch, and some changes from review with Ken are in
"issue_16" branch.

Paul,
If Myriad is started with 'checkpoint: true' option in myriad-config yaml,
then if RM/Scheduler accidently dies the NM/Executor won't exit and will
not be killed my Mesos until 'frameworkFailoverTimeout' timeout again
specified in myriad-config yaml.

As Adam mentioned, restart of Scheduler/RM will trigger mesos task
reconciliation
that will re-sync the TASK state of NM tasks.

Now, about issue #55. You are right, if we start Myriad with 'checkpoint:
false', and the RM/Scheduler dies, then Mesos will take out any running
tasks (NM) launched my Myriad. But, unless it's a test cluster most folks
would like cluster to be resilient of RM/Scheduler failures, and would like
NM tasks to keep running until RM/Scheduler comes back.

And, you are right, as part of HA work, we would like to store task
statuses in ZK/Mesos log to allow recovery via reconciliation post RM
failure.

Regards
Mohit

On Wed, Apr 1, 2015 at 12:47 PM, Adam Bordelon <ad...@mesosphere.io> wrote:

> I know one of the things Mohit was working on was scheduler HA and task
> reconciliation, so that if the RM dies, then Marathon (or another PaaS)
> could restart it elsewhere, recover its state, and reconnect to running
> tasks. In this scenario, we definitely want to keep the executors/NMs
> running when the RM/scheduler exits.
> See https://github.com/mesos/myriad/issues/13 and
> https://github.com/mesos/myriad/issues/16
> Mohit, what's the status of this work? Do you have a branch you can share
> that others can continue on, if you don't have time to complete it
> yourself?
>
> RM HA via multiple RMs is a separate consideration:
>
> http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_hag_rm_ha_config.html
> There are a few considerations with having multiple RMs (active+standby)
> hence multiple schedulers. Only one can be registered with Mesos at a time,
> so we'd want to ensure that the MyriadScheduler only registers with Mesos
> if it is the active RM. We'd also want to store/retrieve scheduler state
> (in ZK/wherever) when failing over to another RM/Scheduler, but we'll do
> that for general single-RM HA anyway.
>
> On Wed, Apr 1, 2015 at 5:33 AM, Paul Read <pd...@gmail.com> wrote:
>
> > If you don't mind adding a connection to zookeeper, storing the tasking
> > status by host and instance in zookeeper, cleaning up on a graceful RM
> die,
> > then you should be able to recover at virtually any point. And have
> > multiple RMs if that is a goal.
> >
> > Not sure at this point if the Executor would need to connect to zookeeper
> > or just the scheduler. At first glance I would think just the Scheduler
> > however if the RM accidentally dies and then the Executor is killed it
> may
> > be reasonable to have it update ZK with status...or just have any RM when
> > it comes up to re-sync by requesting a sync msg and if it does not get
> one
> > in a reasonable amount of time assume its dead...could go so far as to
> > track PIDs and test to see if they are out there as well.
> >
> > Just a few thoughts.
> >
> > On Wed, Apr 1, 2015 at 5:31 AM, Paul Read <pd...@gmail.com> wrote:
> >
> > >
> > > Is it reasonable to expect the Executor and NM to exit if the the
> > > RM/Scheduler accidently dies or is killed? Or should a restart of the
> > > RM/Scheduler re-sync with the running Executor/NM ?
> > >
> > > I know there is currently no mechanism to do that but I was looking at
> > > issue #55 and part of the problem/solution would be eliminated if the
> > child
> > > tasks were to terminate if the RM dies.
> > >
> > >
> > >
> >
>

Re: Death of the RM

Posted by Santosh Marella <sm...@maprtech.com>.
> Only one can be registered with Mesos at a time,
so we'd want to ensure that the MyriadScheduler only registers with Mesos
if it is the active RM.

This is already the behavior in Myriad. Only the active-RM loads a YARN
scheduler (FairScheduler/CapcityScheduler etc).
Since Myriad plugs into RM by extending a YARN scheduler, only the active
RM will initialize Myriad. Hence
Myriad registers with mesos only when there is a RM in "active" state.

Thanks,
Santosh

On Wed, Apr 1, 2015 at 12:47 PM, Adam Bordelon <ad...@mesosphere.io> wrote:

> I know one of the things Mohit was working on was scheduler HA and task
> reconciliation, so that if the RM dies, then Marathon (or another PaaS)
> could restart it elsewhere, recover its state, and reconnect to running
> tasks. In this scenario, we definitely want to keep the executors/NMs
> running when the RM/scheduler exits.
> See https://github.com/mesos/myriad/issues/13 and
> https://github.com/mesos/myriad/issues/16
> Mohit, what's the status of this work? Do you have a branch you can share
> that others can continue on, if you don't have time to complete it
> yourself?
>
> RM HA via multiple RMs is a separate consideration:
>
> http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_hag_rm_ha_config.html
> There are a few considerations with having multiple RMs (active+standby)
> hence multiple schedulers. Only one can be registered with Mesos at a time,
> so we'd want to ensure that the MyriadScheduler only registers with Mesos
> if it is the active RM. We'd also want to store/retrieve scheduler state
> (in ZK/wherever) when failing over to another RM/Scheduler, but we'll do
> that for general single-RM HA anyway.
>
> On Wed, Apr 1, 2015 at 5:33 AM, Paul Read <pd...@gmail.com> wrote:
>
> > If you don't mind adding a connection to zookeeper, storing the tasking
> > status by host and instance in zookeeper, cleaning up on a graceful RM
> die,
> > then you should be able to recover at virtually any point. And have
> > multiple RMs if that is a goal.
> >
> > Not sure at this point if the Executor would need to connect to zookeeper
> > or just the scheduler. At first glance I would think just the Scheduler
> > however if the RM accidentally dies and then the Executor is killed it
> may
> > be reasonable to have it update ZK with status...or just have any RM when
> > it comes up to re-sync by requesting a sync msg and if it does not get
> one
> > in a reasonable amount of time assume its dead...could go so far as to
> > track PIDs and test to see if they are out there as well.
> >
> > Just a few thoughts.
> >
> > On Wed, Apr 1, 2015 at 5:31 AM, Paul Read <pd...@gmail.com> wrote:
> >
> > >
> > > Is it reasonable to expect the Executor and NM to exit if the the
> > > RM/Scheduler accidently dies or is killed? Or should a restart of the
> > > RM/Scheduler re-sync with the running Executor/NM ?
> > >
> > > I know there is currently no mechanism to do that but I was looking at
> > > issue #55 and part of the problem/solution would be eliminated if the
> > child
> > > tasks were to terminate if the RM dies.
> > >
> > >
> > >
> >
>

Re: Death of the RM

Posted by Adam Bordelon <ad...@mesosphere.io>.
I know one of the things Mohit was working on was scheduler HA and task
reconciliation, so that if the RM dies, then Marathon (or another PaaS)
could restart it elsewhere, recover its state, and reconnect to running
tasks. In this scenario, we definitely want to keep the executors/NMs
running when the RM/scheduler exits.
See https://github.com/mesos/myriad/issues/13 and
https://github.com/mesos/myriad/issues/16
Mohit, what's the status of this work? Do you have a branch you can share
that others can continue on, if you don't have time to complete it yourself?

RM HA via multiple RMs is a separate consideration:
http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_hag_rm_ha_config.html
There are a few considerations with having multiple RMs (active+standby)
hence multiple schedulers. Only one can be registered with Mesos at a time,
so we'd want to ensure that the MyriadScheduler only registers with Mesos
if it is the active RM. We'd also want to store/retrieve scheduler state
(in ZK/wherever) when failing over to another RM/Scheduler, but we'll do
that for general single-RM HA anyway.

On Wed, Apr 1, 2015 at 5:33 AM, Paul Read <pd...@gmail.com> wrote:

> If you don't mind adding a connection to zookeeper, storing the tasking
> status by host and instance in zookeeper, cleaning up on a graceful RM die,
> then you should be able to recover at virtually any point. And have
> multiple RMs if that is a goal.
>
> Not sure at this point if the Executor would need to connect to zookeeper
> or just the scheduler. At first glance I would think just the Scheduler
> however if the RM accidentally dies and then the Executor is killed it may
> be reasonable to have it update ZK with status...or just have any RM when
> it comes up to re-sync by requesting a sync msg and if it does not get one
> in a reasonable amount of time assume its dead...could go so far as to
> track PIDs and test to see if they are out there as well.
>
> Just a few thoughts.
>
> On Wed, Apr 1, 2015 at 5:31 AM, Paul Read <pd...@gmail.com> wrote:
>
> >
> > Is it reasonable to expect the Executor and NM to exit if the the
> > RM/Scheduler accidently dies or is killed? Or should a restart of the
> > RM/Scheduler re-sync with the running Executor/NM ?
> >
> > I know there is currently no mechanism to do that but I was looking at
> > issue #55 and part of the problem/solution would be eliminated if the
> child
> > tasks were to terminate if the RM dies.
> >
> >
> >
>

Re: Death of the RM

Posted by Paul Read <pd...@gmail.com>.
If you don't mind adding a connection to zookeeper, storing the tasking
status by host and instance in zookeeper, cleaning up on a graceful RM die,
then you should be able to recover at virtually any point. And have
multiple RMs if that is a goal.

Not sure at this point if the Executor would need to connect to zookeeper
or just the scheduler. At first glance I would think just the Scheduler
however if the RM accidentally dies and then the Executor is killed it may
be reasonable to have it update ZK with status...or just have any RM when
it comes up to re-sync by requesting a sync msg and if it does not get one
in a reasonable amount of time assume its dead...could go so far as to
track PIDs and test to see if they are out there as well.

Just a few thoughts.

On Wed, Apr 1, 2015 at 5:31 AM, Paul Read <pd...@gmail.com> wrote:

>
> Is it reasonable to expect the Executor and NM to exit if the the
> RM/Scheduler accidently dies or is killed? Or should a restart of the
> RM/Scheduler re-sync with the running Executor/NM ?
>
> I know there is currently no mechanism to do that but I was looking at
> issue #55 and part of the problem/solution would be eliminated if the child
> tasks were to terminate if the RM dies.
>
>
>