Posted to reviews@mesos.apache.org by James Peach <jp...@apache.org> on 2017/09/28 00:20:10 UTC

Review Request 62642: Propagated the termination info down the container tree.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62642/
-----------------------------------------------------------

Review request for mesos, Jie Yu and Qian Zhang.


Bugs: MESOS-7963
    https://issues.apache.org/jira/browse/MESOS-7963


Repository: mesos


Description
-------

When the MesosContainerizer destroys a container tree, we need to
propagate the ContainerTermination down to all the child containers
so that any executor that is waiting for them can receive enough
information to send a useful status update.
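
To make the intent concrete, here is a minimal, self-contained sketch of
that propagation. It is not the code in this patch: `Termination`,
`Container`, `ContainerTree` and `destroyTree` are hypothetical stand-ins
for the containerizer's real types, and `std::optional` stands in for
stout's `Option`. The sketch also assumes the hierarchy is a proper tree
(no cycles).

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the ContainerTermination message; the real
// message carries more fields than this.
struct Termination {
  std::string state;    // e.g. "TASK_FAILED"
  std::string message;  // Human-readable cause of the destroy.
};

// Hypothetical container record: child ids plus the termination, if any.
struct Container {
  std::vector<std::string> children;
  std::optional<Termination> termination;
};

using ContainerTree = std::map<std::string, Container>;

// Destroys `id` and all of its descendants, children first, recording the
// same termination on each one so that whoever waits on a child can see
// why it went away. Returns the ids in destruction order.
inline std::vector<std::string> destroyTree(
    ContainerTree& tree,
    const std::string& id,
    const Termination& termination)
{
  std::vector<std::string> destroyed;

  auto it = tree.find(id);
  if (it == tree.end()) {
    return destroyed;  // Unknown container: nothing to do.
  }

  // Recurse first so that nested containers are torn down before parents.
  for (const std::string& child : it->second.children) {
    std::vector<std::string> nested = destroyTree(tree, child, termination);
    destroyed.insert(destroyed.end(), nested.begin(), nested.end());
  }

  it->second.termination = termination;
  destroyed.push_back(id);

  return destroyed;
}
```

In this sketch, destroying a parent whose termination carries a disk-limit
message leaves that same message on every nested container, so a waiter on
a child has something more useful to report than a bare failure.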


Diffs
-----

  src/slave/containerizer/mesos/containerizer.hpp cc23b4d91be16fc95a131c09d07378b801e34d6f 
  src/slave/containerizer/mesos/containerizer.cpp 4d5dc13f363f5d8886983d7dd06a5cecc177c345 


Diff: https://reviews.apache.org/r/62642/diff/1/


Testing
-------

make check (Fedora 26)


Thanks,

James Peach


Re: Review Request 62642: Propagated the termination info down the container tree.

Posted by James Peach <jp...@apache.org>.

(Updated Oct. 10, 2017, 5:27 p.m.)


Review request for mesos, Jie Yu and Qian Zhang.


Changes
-------

Removed REASON_TASK_UNKNOWN usage.


Bugs: MESOS-7963
    https://issues.apache.org/jira/browse/MESOS-7963


Repository: mesos


Description
-------

When the MesosContainerizer destroys a container tree, we need to
propagate the ContainerTermination down to all the child containers
so that any executor that is waiting for them can receive enough
information to send a useful status update.


Diffs (updated)
-----

  src/slave/containerizer/mesos/containerizer.hpp cc23b4d91be16fc95a131c09d07378b801e34d6f 
  src/slave/containerizer/mesos/containerizer.cpp 4d5dc13f363f5d8886983d7dd06a5cecc177c345 


Diff: https://reviews.apache.org/r/62642/diff/3/

Changes: https://reviews.apache.org/r/62642/diff/2-3/


Testing
-------

make check (Fedora 26)


Thanks,

James Peach


Re: Review Request 62642: Propagated the termination info down the container tree.

Posted by Jie Yu <yu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62642/#review186960
-----------------------------------------------------------


Ship It!

- Jie Yu




Re: Review Request 62642: Propagated the termination info down the container tree.

Posted by James Peach <jp...@apache.org>.

> On Oct. 10, 2017, 2:17 p.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/containerizer.cpp
> > Lines 1006 (patched)
> > <https://reviews.apache.org/r/62642/diff/2/?file=1850012#file1850012line1006>
> >
> >     In [this doc](https://github.com/apache/mesos/blob/master/docs/task-state-reasons.md#unused-reasons), I see:
> >     > The reasons REASON_CONTAINER_LIMITATION, REASON_INVALID_FRAMEWORKID, REASON_SLAVE_UNKNOWN, REASON_TASK_UNKNOWN and REASON_EXECUTOR_UNREGISTERED are not used as of Mesos 1.4.
> >     
> >     Since you have used `REASON_TASK_UNKNOWN` in the code here, I think we may need to remove it from the above statement, and change "Mesos 1.4" to "Mesos 1.5", and then put `REASON_TASK_UNKNOWN` into [this table](https://github.com/apache/mesos/blob/master/docs/task-state-reasons.md#for-state-task_failed).

Since this is only used for orphans, I don't think it is possible to deliver `REASON_TASK_UNKNOWN` to schedulers. For a scheduler to receive `REASON_TASK_UNKNOWN`, it would have had to call `wait()` on an orphan container, which can't happen AFAICT. Maybe it would be clearer to just pass `None()` here.
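
As a rough illustration of the `None()` option, assuming a simplified model rather than the real containerizer API: `Reason`, `TerminationInfo`, `orphanTermination` and `describe` below are hypothetical names, and an empty `std::optional` plays the role of stout's `None()`.

```cpp
#include <optional>
#include <string>

// Hypothetical subset of the status reasons, for illustration only.
enum class Reason { CONTAINER_LIMITATION_DISK, TASK_UNKNOWN };

// Hypothetical stand-in for the termination handed to destroy().
struct TerminationInfo {
  std::optional<Reason> reason;  // Empty plays the role of None().
  std::string message;
};

// Nothing waits on an orphan, so no reason is attached at all.
inline TerminationInfo orphanTermination(const std::string& message)
{
  return TerminationInfo{std::nullopt, message};
}

// Renders what a waiter would see; the reason is appended only when set.
inline std::string describe(const TerminationInfo& info)
{
  if (!info.reason.has_value()) {
    return info.message;
  }

  switch (*info.reason) {
    case Reason::CONTAINER_LIMITATION_DISK:
      return info.message + " [REASON_CONTAINER_LIMITATION_DISK]";
    case Reason::TASK_UNKNOWN:
      return info.message + " [REASON_TASK_UNKNOWN]";
  }

  return info.message;  // Unreachable for valid enum values.
}
```

With the reason left empty, no reader of the code has to wonder whether `REASON_TASK_UNKNOWN` can reach a scheduler, and the "unused reasons" documentation stays accurate.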


- James


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62642/#review187527
-----------------------------------------------------------




Re: Review Request 62642: Propagated the termination info down the container tree.

Posted by Qian Zhang <zh...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62642/#review187527
-----------------------------------------------------------


Fix it, then Ship it!





src/slave/containerizer/mesos/containerizer.cpp
Lines 1006 (patched)
<https://reviews.apache.org/r/62642/#comment264536>

    In [this doc](https://github.com/apache/mesos/blob/master/docs/task-state-reasons.md#unused-reasons), I see:
    > The reasons REASON_CONTAINER_LIMITATION, REASON_INVALID_FRAMEWORKID, REASON_SLAVE_UNKNOWN, REASON_TASK_UNKNOWN and REASON_EXECUTOR_UNREGISTERED are not used as of Mesos 1.4.
    
    Since you have used `REASON_TASK_UNKNOWN` in the code here, I think we may need to remove it from the above statement, and change "Mesos 1.4" to "Mesos 1.5", and then put `REASON_TASK_UNKNOWN` into [this table](https://github.com/apache/mesos/blob/master/docs/task-state-reasons.md#for-state-task_failed).


- Qian Zhang

