Posted to reviews@mesos.apache.org by Michael Park <mp...@apache.org> on 2017/02/27 22:18:53 UTC
Review Request 57109: Re-checkpointed the `Executor`s and `Task`s
during agent recovery.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57109/
-----------------------------------------------------------
Review request for mesos and Benjamin Mahler.
Bugs: MESOS-7061
https://issues.apache.org/jira/browse/MESOS-7061
Repository: mesos
Description
-------
Re-checkpointed the tasks and executors during agent recovery by calling
`checkpointX` from the `recoverX` functions for tasks and executors.
Diffs
-----
src/slave/slave.hpp 3b0aea4e3e9a17501077beccbccaab4abbe11af2
src/slave/slave.cpp fc480ae23ffa5cdeeb79b3621a08e1f8703bc01a
Diff: https://reviews.apache.org/r/57109/diff/
Testing
-------
Thanks,
Michael Park
Re: Review Request 57109: Re-checkpointed the `Executor`s and `Task`s
during agent recovery.
Posted by Benjamin Mahler <bm...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57109/#review167169
-----------------------------------------------------------
src/slave/slave.cpp (lines 5311 - 5315)
<https://reviews.apache.org/r/57109/#comment239342>
We could clarify here that, in order to support a scheduler upgrading to MULTI_ROLE and then changing its roles, we need to do <this>.
src/slave/slave.cpp (lines 6948 - 6950)
<https://reviews.apache.org/r/57109/#comment239340>
It would be nice to avoid checkpointing every time we recover the agent, since we only need to re-checkpoint if any allocation info injection took place.
I was also going to suggest a comment here, but I think once this is updated to conditional checkpointing it will be a bit clearer that we're doing this in support of the multi-role upgrade case (if not, we probably want to clarify this).
- Benjamin Mahler
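The conditional re-checkpointing suggested in the comment above could look roughly like the following. This is only an illustrative sketch, not the code under review: `TaskRecord`, `CheckpointStore`, `injectAllocationRole`, and `recoverTasks` are hypothetical names, the in-memory map stands in for the agent's on-disk checkpoints, and an empty role string is assumed to mean "checkpointed before the MULTI_ROLE upgrade".

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a checkpointed task; NOT the Mesos `Task` protobuf.
struct TaskRecord {
  std::string id;
  std::string allocationRole;  // Assumption: empty means "no allocation info yet".
};

// Hypothetical in-memory stand-in for checkpoint files, keyed by task ID.
typedef std::map<std::string, TaskRecord> CheckpointStore;

// Injects the framework's role when the record has none.
// Returns true only if the record was actually modified.
bool injectAllocationRole(TaskRecord& task, const std::string& frameworkRole) {
  if (!task.allocationRole.empty()) {
    return false;
  }
  task.allocationRole = frameworkRole;
  return true;
}

// Walks the recovered tasks and re-checkpoints only those whose state
// changed due to injection. Returns the number of records rewritten.
std::size_t recoverTasks(
    std::vector<TaskRecord>& recovered,
    const std::string& frameworkRole,
    CheckpointStore& store) {
  std::size_t rewritten = 0;
  for (TaskRecord& task : recovered) {
    if (injectAllocationRole(task, frameworkRole)) {
      store[task.id] = task;  // Re-checkpoint the updated record.
      ++rewritten;
    }
  }
  return rewritten;
}
```

With this shape, a second recovery pass over the same tasks rewrites nothing, because every record already carries allocation info.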
Re: Review Request 57109: Re-checkpointed the `Executor`s and `Task`s
during agent recovery.
Posted by Michael Park <mp...@apache.org>.
> On March 2, 2017, 3:11 p.m., Benjamin Mahler wrote:
> > src/slave/slave.hpp
> > Lines 1062 (patched)
> > <https://reviews.apache.org/r/57109/diff/2/?file=1654025#file1654025line1062>
> >
> > maybe `recheckpointExecutor`?
Took this recommendation and also updated to `recheckpointTask`.
> On March 2, 2017, 3:11 p.m., Benjamin Mahler wrote:
> > src/slave/slave.cpp
> > Lines 6973 (patched)
> > <https://reviews.apache.org/r/57109/diff/2/?file=1654026#file1654026line6976>
> >
> > Should we put this after the checkpoint call so that the ownership is a little more clear?
Done.
- Michael
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57109/#review167758
-----------------------------------------------------------
Re: Review Request 57109: Re-checkpointed the `Executor`s and `Task`s
during agent recovery.
Posted by Benjamin Mahler <bm...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57109/#review167758
-----------------------------------------------------------
Fix it, then Ship it!
src/slave/slave.hpp
Lines 416-417 (patched)
<https://reviews.apache.org/r/57109/#comment239720>
How about `executorsToRecheckpoint` and `tasksToRecheckpoint`?
src/slave/slave.hpp
Lines 1062 (patched)
<https://reviews.apache.org/r/57109/#comment239721>
maybe `recheckpointExecutor`?
src/slave/slave.hpp
Lines 1063 (patched)
<https://reviews.apache.org/r/57109/#comment239722>
Ditto here.
src/slave/slave.cpp
Lines 6973 (patched)
<https://reviews.apache.org/r/57109/#comment239723>
Should we put this after the checkpoint call so that the ownership is a little more clear?
- Benjamin Mahler
Re: Review Request 57109: Re-checkpointed the `Executor`s and `Task`s
during agent recovery.
Posted by Michael Park <mp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57109/
-----------------------------------------------------------
(Updated March 3, 2017, 1:12 a.m.)
Review request for mesos and Benjamin Mahler.
Changes
-------
Addressed bmahler's comments.
Bugs: MESOS-7061
https://issues.apache.org/jira/browse/MESOS-7061
Repository: mesos
Description
-------
Re-checkpointed the `Executor`s and `Task`s during agent recovery.
Diffs (updated)
-----
src/slave/slave.hpp 449971b6b343c7714e1d1167a55bbdfe94d2cf83
src/slave/slave.cpp 6ae9458cc81a7623a1837cd636156434a972004b
Diff: https://reviews.apache.org/r/57109/diff/3/
Changes: https://reviews.apache.org/r/57109/diff/2-3/
Testing
-------
Thanks,
Michael Park
Re: Review Request 57109: Re-checkpointed the `Executor`s and `Task`s
during agent recovery.
Posted by Michael Park <mp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57109/
-----------------------------------------------------------
(Updated March 2, 2017, 1:48 p.m.)
Review request for mesos and Benjamin Mahler.
Changes
-------
Addressed bmahler's comments.
Bugs: MESOS-7061
https://issues.apache.org/jira/browse/MESOS-7061
Repository: mesos
Description (updated)
-------
Re-checkpointed the `Executor`s and `Task`s during agent recovery.
Diffs (updated)
-----
src/slave/slave.hpp 449971b6b343c7714e1d1167a55bbdfe94d2cf83
src/slave/slave.cpp 6ae9458cc81a7623a1837cd636156434a972004b
Diff: https://reviews.apache.org/r/57109/diff/2/
Changes: https://reviews.apache.org/r/57109/diff/1-2/
Testing
-------
Thanks,
Michael Park