Posted to reviews@mesos.apache.org by James Peach <jp...@apache.org> on 2017/09/28 00:20:10 UTC
Review Request 62642: Propagated the termination info down the
container tree.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62642/
-----------------------------------------------------------
Review request for mesos, Jie Yu and Qian Zhang.
Bugs: MESOS-7963
https://issues.apache.org/jira/browse/MESOS-7963
Repository: mesos
Description
-------
When the MesosContainerizer destroys a container tree, we need to
propagate the ContainerTermination down to all the child containers
so that any executor that is waiting for them can receive enough
information to send a useful status update.
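
As a rough illustration of the idea in the description (not the code in this patch), the sketch below copies one termination record onto every container in a tree as it is destroyed. `Termination`, `Container`, `ContainerTree` and `destroyTree` are hypothetical stand-ins invented for this example; they are not the MesosContainerizer's types or API.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the ContainerTermination message.
struct Termination {
  std::string state;    // e.g. "TASK_FAILED"
  std::string message;  // human-readable explanation
};

// Hypothetical container record: the IDs of its children plus the
// termination info that a waiter on this container would observe.
struct Container {
  std::vector<std::string> children;
  bool terminated = false;
  Termination termination;
};

using ContainerTree = std::map<std::string, Container>;

// Marks `id` and every descendant as terminated, copying the same
// termination info onto each node so that a waiter on any child can
// see why the container went away. Returns the number of containers
// marked; an unknown ID marks nothing.
inline int destroyTree(
    ContainerTree& tree,
    const std::string& id,
    const Termination& termination)
{
  auto it = tree.find(id);
  if (it == tree.end()) {
    return 0;
  }

  int marked = 0;

  // Visit children first, then the container itself.
  for (const std::string& child : it->second.children) {
    marked += destroyTree(tree, child, termination);
  }

  it->second.terminated = true;
  it->second.termination = termination;
  return marked + 1;
}
```

With this shape, anything waiting on a nested container sees the same state and message as the root it was destroyed with, rather than an empty result.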
Diffs
-----
src/slave/containerizer/mesos/containerizer.hpp cc23b4d91be16fc95a131c09d07378b801e34d6f
src/slave/containerizer/mesos/containerizer.cpp 4d5dc13f363f5d8886983d7dd06a5cecc177c345
Diff: https://reviews.apache.org/r/62642/diff/1/
Testing
-------
make check (Fedora 26)
Thanks,
James Peach
Re: Review Request 62642: Propagated the termination info down the
container tree.
Posted by James Peach <jp...@apache.org>.
(Updated Oct. 10, 2017, 5:27 p.m.)
Review request for mesos, Jie Yu and Qian Zhang.
Changes
-------
Removed REASON_TASK_UNKNOWN usage.
Bugs: MESOS-7963
https://issues.apache.org/jira/browse/MESOS-7963
Repository: mesos
Description
-------
When the MesosContainerizer destroys a container tree, we need to
propagate the ContainerTermination down to all the child containers
so that any executor that is waiting for them can receive enough
information to send a useful status update.
Diffs (updated)
-----
src/slave/containerizer/mesos/containerizer.hpp cc23b4d91be16fc95a131c09d07378b801e34d6f
src/slave/containerizer/mesos/containerizer.cpp 4d5dc13f363f5d8886983d7dd06a5cecc177c345
Diff: https://reviews.apache.org/r/62642/diff/3/
Changes: https://reviews.apache.org/r/62642/diff/2-3/
Testing
-------
make check (Fedora 26)
Thanks,
James Peach
Re: Review Request 62642: Propagated the termination info down the
container tree.
Posted by Jie Yu <yu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62642/#review186960
-----------------------------------------------------------
Ship it!
- Jie Yu
Re: Review Request 62642: Propagated the termination info down the
container tree.
Posted by James Peach <jp...@apache.org>.
> On Oct. 10, 2017, 2:17 p.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/containerizer.cpp
> > Lines 1006 (patched)
> > <https://reviews.apache.org/r/62642/diff/2/?file=1850012#file1850012line1006>
> >
> > In [this doc](https://github.com/apache/mesos/blob/master/docs/task-state-reasons.md#unused-reasons), I see:
> > > The reasons REASON_CONTAINER_LIMITATION, REASON_INVALID_FRAMEWORKID, REASON_SLAVE_UNKNOWN, REASON_TASK_UNKNOWN and REASON_EXECUTOR_UNREGISTERED are not used as of Mesos 1.4.
> >
> > Since you have used `REASON_TASK_UNKNOWN` in the code here, I think we may need to remove it from the above statement, and change "Mesos 1.4" to "Mesos 1.5", and then put `REASON_TASK_UNKNOWN` into [this table](https://github.com/apache/mesos/blob/master/docs/task-state-reasons.md#for-state-task_failed).
Since this is only used for orphans, I don't think it is possible to deliver `REASON_TASK_UNKNOWN` to schedulers. For a scheduler to receive `REASON_TASK_UNKNOWN`, it would have had to call `wait()` on an orphan container, which can't happen AFAICT. Maybe it would be clearer to just pass `None()` here.
- James
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62642/#review187527
-----------------------------------------------------------
Re: Review Request 62642: Propagated the termination info down the
container tree.
Posted by Qian Zhang <zh...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62642/#review187527
-----------------------------------------------------------
Fix it, then Ship it!
src/slave/containerizer/mesos/containerizer.cpp
Lines 1006 (patched)
<https://reviews.apache.org/r/62642/#comment264536>
In [this doc](https://github.com/apache/mesos/blob/master/docs/task-state-reasons.md#unused-reasons), I see:
> The reasons REASON_CONTAINER_LIMITATION, REASON_INVALID_FRAMEWORKID, REASON_SLAVE_UNKNOWN, REASON_TASK_UNKNOWN and REASON_EXECUTOR_UNREGISTERED are not used as of Mesos 1.4.
Since you have used `REASON_TASK_UNKNOWN` in the code here, I think we may need to remove it from the above statement, and change "Mesos 1.4" to "Mesos 1.5", and then put `REASON_TASK_UNKNOWN` into [this table](https://github.com/apache/mesos/blob/master/docs/task-state-reasons.md#for-state-task_failed).
- Qian Zhang