Posted to user@mesos.apache.org by Megha Sharma <ms...@apple.com> on 2016/10/27 00:13:50 UTC

Design for Restartable Tasks

Hi All,

We have been working on a design that allows tasks to be restarted on an agent after the agent itself restarts. Looking forward to your comments/feedback.

Design Doc:
https://docs.google.com/document/d/1YS_EBUNLkzpSru0dwn_hPUIeTATiWckSaosXSIaHUCo/edit#heading=h.tlevdyt3yv0a

JIRA:
https://issues.apache.org/jira/browse/MESOS-3545

Many Thanks
Megha Sharma





Re: Design for Restartable Tasks

Posted by Megha Sharma <ms...@apple.com>.
Hi All,

Thanks for your feedback on the design. Here’s the revised design for Restartable Tasks:

https://docs.google.com/document/d/1epYCznSjevbiA776Yr72xx365IGEzKRnoLnXBorm0J0/edit?usp=sharing

Based on the feedback on the old design and discussions with several committers, we have added executor-driven restart and task group restart to the design. Looking forward to your comments/feedback.

Many Thanks
Megha Sharma

> On Oct 26, 2016, at 6:23 PM, Benjamin Mahler <bmahler@apache.org> wrote:
> 
> Thanks for publishing this! Saw some tickets being created and was wondering where this email was.. :)
> 
> The higher level thing that strikes me is that I think the notion of a task restart policy should be managed by the executor (i.e. the executor restarts the task based on the policy). This is aligned with how the existing kill and health check policies work. This project seems to be something more along the lines of a restartable executor, alongside a change to perform agent recovery across reboot?
> 
> Since this project is pretty complicated, it would be prudent to gather some committers to provide feedback and we can publish our notes to the lists.
> 
> Ben
> 
> On Wed, Oct 26, 2016 at 5:13 PM, Megha Sharma <msharma3@apple.com> wrote:
> Hi All,
> 
> We have been working on the design to allow tasks which need to be restarted on the agent post its restart. Looking forward to your comments/feedback.
> 
> Design Doc:
> https://docs.google.com/document/d/1YS_EBUNLkzpSru0dwn_hPUIeTATiWckSaosXSIaHUCo/edit#heading=h.tlevdyt3yv0a
> 
> JIRA:
> https://issues.apache.org/jira/browse/MESOS-3545
> 
> Many Thanks
> Megha Sharma


Re: Design for Restartable Tasks

Posted by Benjamin Mahler <bm...@apache.org>.
Thanks for publishing this! Saw some tickets being created and was
wondering where this email was.. :)

The higher level thing that strikes me is that I think the notion of a task
restart policy should be managed by the executor (i.e. the executor
restarts the task based on the policy). This is aligned with how the
existing kill and health check policies work. This project seems to be
something more along the lines of a restartable executor, alongside a
change to perform agent recovery across reboot?
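
To illustrate the executor-managed approach described above, here is a minimal sketch in Python. It is not Mesos code: `RestartPolicy`, `max_restarts`, and `run_with_restarts` are hypothetical names, and `launch` simply stands in for whatever starts the task and reports its exit code. The only point is that the relaunch decision sits with the executor.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RestartPolicy:
    """Hypothetical policy object; not a Mesos type."""
    max_restarts: int  # how many times the executor may relaunch the task


def run_with_restarts(launch: Callable[[], int], policy: RestartPolicy) -> int:
    """Run `launch` until it exits 0 or the restart budget is used up.

    `launch` is an assumed callable that starts the task and returns its
    exit code. Returns the number of restarts performed.
    """
    restarts = 0
    while launch() != 0 and restarts < policy.max_restarts:
        restarts += 1  # the executor, not the agent, decides to relaunch
    return restarts
```

With this shape the agent only needs to keep the executor alive; what counts as a failure and how often to retry stays inside the executor, next to the kill and health check handling.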

Since this project is pretty complicated, it would be prudent to gather
some committers to provide feedback and we can publish our notes to the
lists.

Ben

On Wed, Oct 26, 2016 at 5:13 PM, Megha Sharma <ms...@apple.com> wrote:

> Hi All,
>
> We have been working on the design to allow tasks which need to be
> restarted on the agent post its restart. Looking forward to your
> comments/feedback.
>
> Design Doc:
> https://docs.google.com/document/d/1YS_EBUNLkzpSru0dwn_hPUIeTATiWckSaosXSIaHUCo/edit#heading=h.tlevdyt3yv0a
>
> JIRA:
> https://issues.apache.org/jira/browse/MESOS-3545
>
> Many Thanks
> Megha Sharma

Re: Design for Restartable Tasks

Posted by daemeon reiydelle <da...@gmail.com>.
I would argue there should be an option to NOT restart tasks until the
agent has re-registered with the master. This would avoid tasks coming back
up when a node is accidentally restarted after being shut down.
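
A minimal sketch of such an option, in Python and under stated assumptions: `should_restart_tasks`, `wait_for_master`, and `agent_reregistered` are hypothetical names, not Mesos flags or APIs. The sketch only captures the gating rule, namely that with the option enabled, tasks stay down until the agent is back in contact with the master.

```python
def should_restart_tasks(agent_reregistered: bool, wait_for_master: bool) -> bool:
    """Decide whether the agent may restart its tasks after coming back up.

    Both parameters are hypothetical: `wait_for_master` is the proposed
    opt-in, and `agent_reregistered` says whether the master has accepted
    the agent again.
    """
    if wait_for_master:
        # Hold tasks until the master knows about this agent again.
        return agent_reregistered
    # Option off: restart immediately.
    return True
```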


Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Wed, Oct 26, 2016 at 5:13 PM, Megha Sharma <ms...@apple.com> wrote:

> Hi All,
>
> We have been working on the design to allow tasks which need to be
> restarted on the agent post its restart. Looking forward to your
> comments/feedback.
>
> Design Doc:
> https://docs.google.com/document/d/1YS_EBUNLkzpSru0dwn_hPUIeTATiWckSaosXSIaHUCo/edit#heading=h.tlevdyt3yv0a
>
> JIRA:
> https://issues.apache.org/jira/browse/MESOS-3545
>
> Many Thanks
> Megha Sharma
