You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Jie Yu <yu...@gmail.com> on 2015/09/25 04:14:55 UTC

Review Request 38746: Added TaskStatus::Reason to containerizer Termination message.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38746/
-----------------------------------------------------------

Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.


Bugs: MESOS-2035
    https://issues.apache.org/jira/browse/MESOS-2035


Repository: mesos


Description
-------

Added TaskStatus::Reason to containerizer Termination message.


Diffs
-----

  include/mesos/containerizer/containerizer.proto f16ccc89f83da28c413ccfa0687a06b7515a605c 
  include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 
  include/mesos/slave/isolator.proto 9d38a25470a79af1eda9122c037c82b8cbbad6ed 
  src/common/protobuf_utils.hpp 8793851fb927ab1326da6b6a424b3c6a75eb5001 
  src/common/protobuf_utils.cpp c1e8e011e6d0b8e1cd8a836e3168685ec401b21b 
  src/slave/containerizer/docker.cpp efa37266368ee12fea9134b35ebc5047a2820f94 
  src/slave/containerizer/isolators/cgroups/mem.cpp 89c86beb9227eb8a6e70a413e7b3934add652981 
  src/slave/containerizer/isolators/posix/disk.cpp c324c79f8d598095d07fbcb26e806a0978c2a520 
  src/slave/containerizer/mesos/containerizer.hpp 4c1419290645ad4c44360a81618a6cea7ad190df 
  src/slave/containerizer/mesos/containerizer.cpp b904b2d88e9b62fa4ba312c4569a4d89b0dc6052 
  src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
  src/tests/containerizer/docker_containerizer_tests.cpp 8771ef661039310d79845513ea4602e15b2ad13d 
  src/tests/containerizer/mesos_containerizer_tests.cpp 5bc7d408bda0c249e1b66747d8bd87e688362e6c 
  src/tests/slave_recovery_tests.cpp fd285fbba0ec08fe2afbdd8da0b0b451126da0eb 
  src/tests/slave_tests.cpp dccdbb0ec5352df5da63d5ef7261bfc7fd599acb 

Diff: https://reviews.apache.org/r/38746/diff/


Testing
-------

sudo make check

@tnachen, I need you to help test this patch with docker containerizer.


Thanks,

Jie Yu


Re: Review Request 38746: Added TaskStatus::Reason to containerizer Termination message.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38746/#review102969
-----------------------------------------------------------



include/mesos/mesos.proto (lines 1104 - 1111)
<https://reviews.apache.org/r/38746/#comment160803>

    I was curious if enums supported aliases at the time but I didn't dig in, apologies!
    
    Looks like we should use the enum aliases to preserve compatibility:
    
    https://developers.google.com/protocol-buffers/docs/proto?hl=en#enum


- Ben Mahler


On Oct. 12, 2015, 7:33 p.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38746/
> -----------------------------------------------------------
> 
> (Updated Oct. 12, 2015, 7:33 p.m.)
> 
> 
> Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
> 
> 
> Bugs: MESOS-2035
>     https://issues.apache.org/jira/browse/MESOS-2035
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added TaskStatus::Reason to containerizer Termination message.
> 
> The following doc summarized the problem and proposed a solution:
> https://docs.google.com/document/d/1klGDAu5yBVf-CGWLqvELLIfxLfRaisGkhi6Gn7952-4/edit?usp=sharing
> 
> 
> Diffs
> -----
> 
>   include/mesos/containerizer/containerizer.proto f16ccc89f83da28c413ccfa0687a06b7515a605c 
>   include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 
>   include/mesos/slave/isolator.proto 9d38a25470a79af1eda9122c037c82b8cbbad6ed 
>   src/common/protobuf_utils.hpp 8793851fb927ab1326da6b6a424b3c6a75eb5001 
>   src/common/protobuf_utils.cpp c1e8e011e6d0b8e1cd8a836e3168685ec401b21b 
>   src/slave/containerizer/docker.cpp 6c975f904178e01797b67628a2d471ec7b3b1fbf 
>   src/slave/containerizer/external_containerizer.cpp 211649201777f0d2ce802a865090129eacdd53be 
>   src/slave/containerizer/isolators/cgroups/mem.cpp 89c86beb9227eb8a6e70a413e7b3934add652981 
>   src/slave/containerizer/isolators/posix/disk.cpp c324c79f8d598095d07fbcb26e806a0978c2a520 
>   src/slave/containerizer/mesos/containerizer.hpp 4c1419290645ad4c44360a81618a6cea7ad190df 
>   src/slave/containerizer/mesos/containerizer.cpp b904b2d88e9b62fa4ba312c4569a4d89b0dc6052 
>   src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 
>   src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
>   src/tests/containerizer.cpp 1f743155526192569dd61a47ab67d8e58aab205a 
>   src/tests/containerizer/docker_containerizer_tests.cpp 8771ef661039310d79845513ea4602e15b2ad13d 
>   src/tests/containerizer/mesos_containerizer_tests.cpp 5bc7d408bda0c249e1b66747d8bd87e688362e6c 
>   src/tests/oversubscription_tests.cpp 2cf047e6ae061c1b9eed6238e1df67c91ab3514a 
>   src/tests/slave_recovery_tests.cpp fd285fbba0ec08fe2afbdd8da0b0b451126da0eb 
>   src/tests/slave_tests.cpp dccdbb0ec5352df5da63d5ef7261bfc7fd599acb 
> 
> Diff: https://reviews.apache.org/r/38746/diff/
> 
> 
> Testing
> -------
> 
> sudo make check
> 
> tested with Docker as well.
> 
> 
> Thanks,
> 
> Jie Yu
> 
>


Re: Review Request 38746: Added TaskStatus::Reason to containerizer Termination message.

Posted by Jie Yu <yu...@gmail.com>.

> On Oct. 12, 2015, 9:03 p.m., Ben Mahler wrote:
> > include/mesos/mesos.proto, lines 1100-1101
> > <https://reviews.apache.org/r/38746/diff/3/?file=1096299#file1096299line1100>
> >
> >     Shall we say why it's "bad"? i.e. the default value when a caller doesn't check for presence is 0 and so ideally the 0 reason is not a valid one.

Done. Thanks!


> On Oct. 12, 2015, 9:03 p.m., Ben Mahler wrote:
> > src/slave/slave.cpp, lines 1673-1695
> > <https://reviews.apache.org/r/38746/diff/3/?file=1096310#file1096310line1673>
> >
> >     It's still a little unsettling that these container update / launch failure cases don't properly set the executor state to Executor::TERMINATING, can you at least add a TODO in these to signal the reader of the code?
> >     
> >     Curious what we can do to simplify all of these redundant container update failure cases, worth thinking about later..

Yes, I'll follow up with a patch shortly. Working on it. I'll add a TODO for now.


> On Oct. 12, 2015, 9:03 p.m., Ben Mahler wrote:
> > include/mesos/mesos.proto, lines 1105-1107
> > <https://reviews.apache.org/r/38746/diff/3/?file=1096299#file1096299line1105>
> >
> >     Would be nice to have these renames in the upgrade notes, mentioning that the numbers are backwards compatible but there needs to be a compile time adjustment for those relying on REASON_MEMORY_LIMIT :)

Expecting a small patch shortly.


- Jie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38746/#review102317
-----------------------------------------------------------


On Oct. 12, 2015, 7:33 p.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38746/
> -----------------------------------------------------------
> 
> (Updated Oct. 12, 2015, 7:33 p.m.)
> 
> 
> Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
> 
> 
> Bugs: MESOS-2035
>     https://issues.apache.org/jira/browse/MESOS-2035
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added TaskStatus::Reason to containerizer Termination message.
> 
> The following doc summarized the problem and proposed a solution:
> https://docs.google.com/document/d/1klGDAu5yBVf-CGWLqvELLIfxLfRaisGkhi6Gn7952-4/edit?usp=sharing
> 
> 
> Diffs
> -----
> 
>   include/mesos/containerizer/containerizer.proto f16ccc89f83da28c413ccfa0687a06b7515a605c 
>   include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 
>   include/mesos/slave/isolator.proto 9d38a25470a79af1eda9122c037c82b8cbbad6ed 
>   src/common/protobuf_utils.hpp 8793851fb927ab1326da6b6a424b3c6a75eb5001 
>   src/common/protobuf_utils.cpp c1e8e011e6d0b8e1cd8a836e3168685ec401b21b 
>   src/slave/containerizer/docker.cpp 6c975f904178e01797b67628a2d471ec7b3b1fbf 
>   src/slave/containerizer/external_containerizer.cpp 211649201777f0d2ce802a865090129eacdd53be 
>   src/slave/containerizer/isolators/cgroups/mem.cpp 89c86beb9227eb8a6e70a413e7b3934add652981 
>   src/slave/containerizer/isolators/posix/disk.cpp c324c79f8d598095d07fbcb26e806a0978c2a520 
>   src/slave/containerizer/mesos/containerizer.hpp 4c1419290645ad4c44360a81618a6cea7ad190df 
>   src/slave/containerizer/mesos/containerizer.cpp b904b2d88e9b62fa4ba312c4569a4d89b0dc6052 
>   src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 
>   src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
>   src/tests/containerizer.cpp 1f743155526192569dd61a47ab67d8e58aab205a 
>   src/tests/containerizer/docker_containerizer_tests.cpp 8771ef661039310d79845513ea4602e15b2ad13d 
>   src/tests/containerizer/mesos_containerizer_tests.cpp 5bc7d408bda0c249e1b66747d8bd87e688362e6c 
>   src/tests/oversubscription_tests.cpp 2cf047e6ae061c1b9eed6238e1df67c91ab3514a 
>   src/tests/slave_recovery_tests.cpp fd285fbba0ec08fe2afbdd8da0b0b451126da0eb 
>   src/tests/slave_tests.cpp dccdbb0ec5352df5da63d5ef7261bfc7fd599acb 
> 
> Diff: https://reviews.apache.org/r/38746/diff/
> 
> 
> Testing
> -------
> 
> sudo make check
> 
> tested with Docker as well.
> 
> 
> Thanks,
> 
> Jie Yu
> 
>


Re: Review Request 38746: Added TaskStatus::Reason to containerizer Termination message.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38746/#review102317
-----------------------------------------------------------

Ship it!


Really nice to see this stuff get fixed, thanks Jie!


include/mesos/mesos.proto (lines 1100 - 1101)
<https://reviews.apache.org/r/38746/#comment159923>

    Shall we say why it's "bad"? i.e. the default value when a caller doesn't check for presence is 0 and so ideally the 0 reason is not a valid one.



include/mesos/mesos.proto (line 1102)
<https://reviews.apache.org/r/38746/#comment159922>

    Looks like tab characters were introduced here?



include/mesos/mesos.proto (lines 1105 - 1107)
<https://reviews.apache.org/r/38746/#comment159924>

    Would be nice to have these renames in the upgrade notes, mentioning that the numbers are backwards compatible but there needs to be a compile time adjustment for those relying on REASON_MEMORY_LIMIT :)



src/slave/slave.cpp (lines 1673 - 1695)
<https://reviews.apache.org/r/38746/#comment159926>

    It's still a little unsettling that these container update / launch failure cases don't properly set the executor state to Executor::TERMINATING, can you at least add a TODO in these to signal the reader of the code?
    
    Curious what we can do to simplify all of these redundant container update failure cases, worth thinking about later..



src/slave/slave.cpp (lines 4518 - 4519)
<https://reviews.apache.org/r/38746/#comment159927>

    Is this still needed? Looks like you've already done this?


- Ben Mahler


On Oct. 12, 2015, 7:33 p.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38746/
> -----------------------------------------------------------
> 
> (Updated Oct. 12, 2015, 7:33 p.m.)
> 
> 
> Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
> 
> 
> Bugs: MESOS-2035
>     https://issues.apache.org/jira/browse/MESOS-2035
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added TaskStatus::Reason to containerizer Termination message.
> 
> The following doc summarized the problem and proposed a solution:
> https://docs.google.com/document/d/1klGDAu5yBVf-CGWLqvELLIfxLfRaisGkhi6Gn7952-4/edit?usp=sharing
> 
> 
> Diffs
> -----
> 
>   include/mesos/containerizer/containerizer.proto f16ccc89f83da28c413ccfa0687a06b7515a605c 
>   include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 
>   include/mesos/slave/isolator.proto 9d38a25470a79af1eda9122c037c82b8cbbad6ed 
>   src/common/protobuf_utils.hpp 8793851fb927ab1326da6b6a424b3c6a75eb5001 
>   src/common/protobuf_utils.cpp c1e8e011e6d0b8e1cd8a836e3168685ec401b21b 
>   src/slave/containerizer/docker.cpp 6c975f904178e01797b67628a2d471ec7b3b1fbf 
>   src/slave/containerizer/external_containerizer.cpp 211649201777f0d2ce802a865090129eacdd53be 
>   src/slave/containerizer/isolators/cgroups/mem.cpp 89c86beb9227eb8a6e70a413e7b3934add652981 
>   src/slave/containerizer/isolators/posix/disk.cpp c324c79f8d598095d07fbcb26e806a0978c2a520 
>   src/slave/containerizer/mesos/containerizer.hpp 4c1419290645ad4c44360a81618a6cea7ad190df 
>   src/slave/containerizer/mesos/containerizer.cpp b904b2d88e9b62fa4ba312c4569a4d89b0dc6052 
>   src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 
>   src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
>   src/tests/containerizer.cpp 1f743155526192569dd61a47ab67d8e58aab205a 
>   src/tests/containerizer/docker_containerizer_tests.cpp 8771ef661039310d79845513ea4602e15b2ad13d 
>   src/tests/containerizer/mesos_containerizer_tests.cpp 5bc7d408bda0c249e1b66747d8bd87e688362e6c 
>   src/tests/oversubscription_tests.cpp 2cf047e6ae061c1b9eed6238e1df67c91ab3514a 
>   src/tests/slave_recovery_tests.cpp fd285fbba0ec08fe2afbdd8da0b0b451126da0eb 
>   src/tests/slave_tests.cpp dccdbb0ec5352df5da63d5ef7261bfc7fd599acb 
> 
> Diff: https://reviews.apache.org/r/38746/diff/
> 
> 
> Testing
> -------
> 
> sudo make check
> 
> tested with Docker as well.
> 
> 
> Thanks,
> 
> Jie Yu
> 
>


Re: Review Request 38746: Added TaskStatus::Reason to containerizer Termination message.

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38746/#review102312
-----------------------------------------------------------


Patch looks great!

Reviews applied: [38746]

All tests passed.

- Mesos ReviewBot


On Oct. 12, 2015, 7:33 p.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38746/
> -----------------------------------------------------------
> 
> (Updated Oct. 12, 2015, 7:33 p.m.)
> 
> 
> Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
> 
> 
> Bugs: MESOS-2035
>     https://issues.apache.org/jira/browse/MESOS-2035
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added TaskStatus::Reason to containerizer Termination message.
> 
> The following doc summarized the problem and proposed a solution:
> https://docs.google.com/document/d/1klGDAu5yBVf-CGWLqvELLIfxLfRaisGkhi6Gn7952-4/edit?usp=sharing
> 
> 
> Diffs
> -----
> 
>   include/mesos/containerizer/containerizer.proto f16ccc89f83da28c413ccfa0687a06b7515a605c 
>   include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 
>   include/mesos/slave/isolator.proto 9d38a25470a79af1eda9122c037c82b8cbbad6ed 
>   src/common/protobuf_utils.hpp 8793851fb927ab1326da6b6a424b3c6a75eb5001 
>   src/common/protobuf_utils.cpp c1e8e011e6d0b8e1cd8a836e3168685ec401b21b 
>   src/slave/containerizer/docker.cpp 6c975f904178e01797b67628a2d471ec7b3b1fbf 
>   src/slave/containerizer/external_containerizer.cpp 211649201777f0d2ce802a865090129eacdd53be 
>   src/slave/containerizer/isolators/cgroups/mem.cpp 89c86beb9227eb8a6e70a413e7b3934add652981 
>   src/slave/containerizer/isolators/posix/disk.cpp c324c79f8d598095d07fbcb26e806a0978c2a520 
>   src/slave/containerizer/mesos/containerizer.hpp 4c1419290645ad4c44360a81618a6cea7ad190df 
>   src/slave/containerizer/mesos/containerizer.cpp b904b2d88e9b62fa4ba312c4569a4d89b0dc6052 
>   src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 
>   src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
>   src/tests/containerizer.cpp 1f743155526192569dd61a47ab67d8e58aab205a 
>   src/tests/containerizer/docker_containerizer_tests.cpp 8771ef661039310d79845513ea4602e15b2ad13d 
>   src/tests/containerizer/mesos_containerizer_tests.cpp 5bc7d408bda0c249e1b66747d8bd87e688362e6c 
>   src/tests/oversubscription_tests.cpp 2cf047e6ae061c1b9eed6238e1df67c91ab3514a 
>   src/tests/slave_recovery_tests.cpp fd285fbba0ec08fe2afbdd8da0b0b451126da0eb 
>   src/tests/slave_tests.cpp dccdbb0ec5352df5da63d5ef7261bfc7fd599acb 
> 
> Diff: https://reviews.apache.org/r/38746/diff/
> 
> 
> Testing
> -------
> 
> sudo make check
> 
> tested with Docker as well.
> 
> 
> Thanks,
> 
> Jie Yu
> 
>


Re: Review Request 38746: Added TaskStatus::Reason to containerizer Termination message.

Posted by Jie Yu <yu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38746/
-----------------------------------------------------------

(Updated Oct. 12, 2015, 7:33 p.m.)


Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.


Changes
-------

Addressed review comments.


Bugs: MESOS-2035
    https://issues.apache.org/jira/browse/MESOS-2035


Repository: mesos


Description
-------

Added TaskStatus::Reason to containerizer Termination message.

The following doc summarized the problem and proposed a solution:
https://docs.google.com/document/d/1klGDAu5yBVf-CGWLqvELLIfxLfRaisGkhi6Gn7952-4/edit?usp=sharing


Diffs (updated)
-----

  include/mesos/containerizer/containerizer.proto f16ccc89f83da28c413ccfa0687a06b7515a605c 
  include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 
  include/mesos/slave/isolator.proto 9d38a25470a79af1eda9122c037c82b8cbbad6ed 
  src/common/protobuf_utils.hpp 8793851fb927ab1326da6b6a424b3c6a75eb5001 
  src/common/protobuf_utils.cpp c1e8e011e6d0b8e1cd8a836e3168685ec401b21b 
  src/slave/containerizer/docker.cpp 6c975f904178e01797b67628a2d471ec7b3b1fbf 
  src/slave/containerizer/external_containerizer.cpp 211649201777f0d2ce802a865090129eacdd53be 
  src/slave/containerizer/isolators/cgroups/mem.cpp 89c86beb9227eb8a6e70a413e7b3934add652981 
  src/slave/containerizer/isolators/posix/disk.cpp c324c79f8d598095d07fbcb26e806a0978c2a520 
  src/slave/containerizer/mesos/containerizer.hpp 4c1419290645ad4c44360a81618a6cea7ad190df 
  src/slave/containerizer/mesos/containerizer.cpp b904b2d88e9b62fa4ba312c4569a4d89b0dc6052 
  src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 
  src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
  src/tests/containerizer.cpp 1f743155526192569dd61a47ab67d8e58aab205a 
  src/tests/containerizer/docker_containerizer_tests.cpp 8771ef661039310d79845513ea4602e15b2ad13d 
  src/tests/containerizer/mesos_containerizer_tests.cpp 5bc7d408bda0c249e1b66747d8bd87e688362e6c 
  src/tests/oversubscription_tests.cpp 2cf047e6ae061c1b9eed6238e1df67c91ab3514a 
  src/tests/slave_recovery_tests.cpp fd285fbba0ec08fe2afbdd8da0b0b451126da0eb 
  src/tests/slave_tests.cpp dccdbb0ec5352df5da63d5ef7261bfc7fd599acb 

Diff: https://reviews.apache.org/r/38746/diff/


Testing
-------

sudo make check

tested with Docker as well.


Thanks,

Jie Yu


Re: Review Request 38746: Added TaskStatus::Reason to containerizer Termination message.

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38746/#review102036
-----------------------------------------------------------


Patch looks great!

Reviews applied: [38746]

All tests passed.

- Mesos ReviewBot


On Oct. 9, 2015, 1:28 a.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38746/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2015, 1:28 a.m.)
> 
> 
> Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
> 
> 
> Bugs: MESOS-2035
>     https://issues.apache.org/jira/browse/MESOS-2035
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added TaskStatus::Reason to containerizer Termination message.
> 
> The following doc summarized the problem and proposed a solution:
> https://docs.google.com/document/d/1klGDAu5yBVf-CGWLqvELLIfxLfRaisGkhi6Gn7952-4/edit?usp=sharing
> 
> 
> Diffs
> -----
> 
>   include/mesos/containerizer/containerizer.proto f16ccc89f83da28c413ccfa0687a06b7515a605c 
>   include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 
>   include/mesos/slave/isolator.proto 9d38a25470a79af1eda9122c037c82b8cbbad6ed 
>   src/common/protobuf_utils.hpp 8793851fb927ab1326da6b6a424b3c6a75eb5001 
>   src/common/protobuf_utils.cpp c1e8e011e6d0b8e1cd8a836e3168685ec401b21b 
>   src/slave/containerizer/docker.cpp 6c975f904178e01797b67628a2d471ec7b3b1fbf 
>   src/slave/containerizer/external_containerizer.cpp 211649201777f0d2ce802a865090129eacdd53be 
>   src/slave/containerizer/isolators/cgroups/mem.cpp 89c86beb9227eb8a6e70a413e7b3934add652981 
>   src/slave/containerizer/isolators/posix/disk.cpp c324c79f8d598095d07fbcb26e806a0978c2a520 
>   src/slave/containerizer/mesos/containerizer.hpp 4c1419290645ad4c44360a81618a6cea7ad190df 
>   src/slave/containerizer/mesos/containerizer.cpp b904b2d88e9b62fa4ba312c4569a4d89b0dc6052 
>   src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 
>   src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
>   src/tests/containerizer.cpp 1f743155526192569dd61a47ab67d8e58aab205a 
>   src/tests/containerizer/docker_containerizer_tests.cpp 8771ef661039310d79845513ea4602e15b2ad13d 
>   src/tests/containerizer/mesos_containerizer_tests.cpp 5bc7d408bda0c249e1b66747d8bd87e688362e6c 
>   src/tests/slave_recovery_tests.cpp fd285fbba0ec08fe2afbdd8da0b0b451126da0eb 
>   src/tests/slave_tests.cpp dccdbb0ec5352df5da63d5ef7261bfc7fd599acb 
> 
> Diff: https://reviews.apache.org/r/38746/diff/
> 
> 
> Testing
> -------
> 
> sudo make check
> 
> tested with Docker as well.
> 
> 
> Thanks,
> 
> Jie Yu
> 
>


Re: Review Request 38746: Added TaskStatus::Reason to containerizer Termination message.

Posted by Jie Yu <yu...@gmail.com>.

> On Oct. 9, 2015, 4:52 a.m., Timothy Chen wrote:
> > include/mesos/mesos.proto, line 1121
> > <https://reviews.apache.org/r/38746/diff/2/?file=1093435#file1093435line1121>
> >
> >     What's the reasoning behind this order?

Reordered according to alphabet.


- Jie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38746/#review102006
-----------------------------------------------------------


On Oct. 9, 2015, 1:28 a.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38746/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2015, 1:28 a.m.)
> 
> 
> Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
> 
> 
> Bugs: MESOS-2035
>     https://issues.apache.org/jira/browse/MESOS-2035
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added TaskStatus::Reason to containerizer Termination message.
> 
> The following doc summarized the problem and proposed a solution:
> https://docs.google.com/document/d/1klGDAu5yBVf-CGWLqvELLIfxLfRaisGkhi6Gn7952-4/edit?usp=sharing
> 
> 
> Diffs
> -----
> 
>   include/mesos/containerizer/containerizer.proto f16ccc89f83da28c413ccfa0687a06b7515a605c 
>   include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 
>   include/mesos/slave/isolator.proto 9d38a25470a79af1eda9122c037c82b8cbbad6ed 
>   src/common/protobuf_utils.hpp 8793851fb927ab1326da6b6a424b3c6a75eb5001 
>   src/common/protobuf_utils.cpp c1e8e011e6d0b8e1cd8a836e3168685ec401b21b 
>   src/slave/containerizer/docker.cpp 6c975f904178e01797b67628a2d471ec7b3b1fbf 
>   src/slave/containerizer/external_containerizer.cpp 211649201777f0d2ce802a865090129eacdd53be 
>   src/slave/containerizer/isolators/cgroups/mem.cpp 89c86beb9227eb8a6e70a413e7b3934add652981 
>   src/slave/containerizer/isolators/posix/disk.cpp c324c79f8d598095d07fbcb26e806a0978c2a520 
>   src/slave/containerizer/mesos/containerizer.hpp 4c1419290645ad4c44360a81618a6cea7ad190df 
>   src/slave/containerizer/mesos/containerizer.cpp b904b2d88e9b62fa4ba312c4569a4d89b0dc6052 
>   src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 
>   src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
>   src/tests/containerizer.cpp 1f743155526192569dd61a47ab67d8e58aab205a 
>   src/tests/containerizer/docker_containerizer_tests.cpp 8771ef661039310d79845513ea4602e15b2ad13d 
>   src/tests/containerizer/mesos_containerizer_tests.cpp 5bc7d408bda0c249e1b66747d8bd87e688362e6c 
>   src/tests/slave_recovery_tests.cpp fd285fbba0ec08fe2afbdd8da0b0b451126da0eb 
>   src/tests/slave_tests.cpp dccdbb0ec5352df5da63d5ef7261bfc7fd599acb 
> 
> Diff: https://reviews.apache.org/r/38746/diff/
> 
> 
> Testing
> -------
> 
> sudo make check
> 
> tested with Docker as well.
> 
> 
> Thanks,
> 
> Jie Yu
> 
>


Re: Review Request 38746: Added TaskStatus::Reason to containerizer Termination message.

Posted by Timothy Chen <tn...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38746/#review102006
-----------------------------------------------------------



include/mesos/mesos.proto (line 1119)
<https://reviews.apache.org/r/38746/#comment159530>

    What's the reasoning behind this order?



src/slave/slave.cpp (line 4509)
<https://reviews.apache.org/r/38746/#comment159531>

    Should we say Executor preempted here?
    
    And also I wonder if we should stick with the Qos correction terminology here, e.g: Container preempted by QoS correction?


- Timothy Chen


On Oct. 9, 2015, 1:28 a.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38746/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2015, 1:28 a.m.)
> 
> 
> Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
> 
> 
> Bugs: MESOS-2035
>     https://issues.apache.org/jira/browse/MESOS-2035
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added TaskStatus::Reason to containerizer Termination message.
> 
> The following doc summarized the problem and proposed a solution:
> https://docs.google.com/document/d/1klGDAu5yBVf-CGWLqvELLIfxLfRaisGkhi6Gn7952-4/edit?usp=sharing
> 
> 
> Diffs
> -----
> 
>   include/mesos/containerizer/containerizer.proto f16ccc89f83da28c413ccfa0687a06b7515a605c 
>   include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 
>   include/mesos/slave/isolator.proto 9d38a25470a79af1eda9122c037c82b8cbbad6ed 
>   src/common/protobuf_utils.hpp 8793851fb927ab1326da6b6a424b3c6a75eb5001 
>   src/common/protobuf_utils.cpp c1e8e011e6d0b8e1cd8a836e3168685ec401b21b 
>   src/slave/containerizer/docker.cpp 6c975f904178e01797b67628a2d471ec7b3b1fbf 
>   src/slave/containerizer/external_containerizer.cpp 211649201777f0d2ce802a865090129eacdd53be 
>   src/slave/containerizer/isolators/cgroups/mem.cpp 89c86beb9227eb8a6e70a413e7b3934add652981 
>   src/slave/containerizer/isolators/posix/disk.cpp c324c79f8d598095d07fbcb26e806a0978c2a520 
>   src/slave/containerizer/mesos/containerizer.hpp 4c1419290645ad4c44360a81618a6cea7ad190df 
>   src/slave/containerizer/mesos/containerizer.cpp b904b2d88e9b62fa4ba312c4569a4d89b0dc6052 
>   src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 
>   src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
>   src/tests/containerizer.cpp 1f743155526192569dd61a47ab67d8e58aab205a 
>   src/tests/containerizer/docker_containerizer_tests.cpp 8771ef661039310d79845513ea4602e15b2ad13d 
>   src/tests/containerizer/mesos_containerizer_tests.cpp 5bc7d408bda0c249e1b66747d8bd87e688362e6c 
>   src/tests/slave_recovery_tests.cpp fd285fbba0ec08fe2afbdd8da0b0b451126da0eb 
>   src/tests/slave_tests.cpp dccdbb0ec5352df5da63d5ef7261bfc7fd599acb 
> 
> Diff: https://reviews.apache.org/r/38746/diff/
> 
> 
> Testing
> -------
> 
> sudo make check
> 
> tested with Docker as well.
> 
> 
> Thanks,
> 
> Jie Yu
> 
>


Re: Review Request 38746: Added TaskStatus::Reason to containerizer Termination message.

Posted by Jie Yu <yu...@gmail.com>.

> On Oct. 9, 2015, 2:06 a.m., Kapil Arya wrote:
> > src/slave/containerizer/mesos/containerizer.cpp, lines 1109-1115
> > <https://reviews.apache.org/r/38746/diff/2/?file=1093444#file1093444line1109>
> >
> >     I am slightly confused here. If one or more `Isolator::prepare` calls returned a failure, where are we accumulating the failure messages?

No, the prepare of each isolator is serialized. So the error message will be from the first failed isolator->prepare.


- Jie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38746/#review101996
-----------------------------------------------------------


On Oct. 9, 2015, 1:28 a.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38746/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2015, 1:28 a.m.)
> 
> 
> Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
> 
> 
> Bugs: MESOS-2035
>     https://issues.apache.org/jira/browse/MESOS-2035
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added TaskStatus::Reason to containerizer Termination message.
> 
> The following doc summarized the problem and proposed a solution:
> https://docs.google.com/document/d/1klGDAu5yBVf-CGWLqvELLIfxLfRaisGkhi6Gn7952-4/edit?usp=sharing
> 
> 
> Diffs
> -----
> 
>   include/mesos/containerizer/containerizer.proto f16ccc89f83da28c413ccfa0687a06b7515a605c 
>   include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 
>   include/mesos/slave/isolator.proto 9d38a25470a79af1eda9122c037c82b8cbbad6ed 
>   src/common/protobuf_utils.hpp 8793851fb927ab1326da6b6a424b3c6a75eb5001 
>   src/common/protobuf_utils.cpp c1e8e011e6d0b8e1cd8a836e3168685ec401b21b 
>   src/slave/containerizer/docker.cpp 6c975f904178e01797b67628a2d471ec7b3b1fbf 
>   src/slave/containerizer/external_containerizer.cpp 211649201777f0d2ce802a865090129eacdd53be 
>   src/slave/containerizer/isolators/cgroups/mem.cpp 89c86beb9227eb8a6e70a413e7b3934add652981 
>   src/slave/containerizer/isolators/posix/disk.cpp c324c79f8d598095d07fbcb26e806a0978c2a520 
>   src/slave/containerizer/mesos/containerizer.hpp 4c1419290645ad4c44360a81618a6cea7ad190df 
>   src/slave/containerizer/mesos/containerizer.cpp b904b2d88e9b62fa4ba312c4569a4d89b0dc6052 
>   src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 
>   src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
>   src/tests/containerizer.cpp 1f743155526192569dd61a47ab67d8e58aab205a 
>   src/tests/containerizer/docker_containerizer_tests.cpp 8771ef661039310d79845513ea4602e15b2ad13d 
>   src/tests/containerizer/mesos_containerizer_tests.cpp 5bc7d408bda0c249e1b66747d8bd87e688362e6c 
>   src/tests/slave_recovery_tests.cpp fd285fbba0ec08fe2afbdd8da0b0b451126da0eb 
>   src/tests/slave_tests.cpp dccdbb0ec5352df5da63d5ef7261bfc7fd599acb 
> 
> Diff: https://reviews.apache.org/r/38746/diff/
> 
> 
> Testing
> -------
> 
> sudo make check
> 
> tested with Docker as well.
> 
> 
> Thanks,
> 
> Jie Yu
> 
>


Re: Review Request 38746: Added TaskStatus::Reason to containerizer Termination message.

Posted by Kapil Arya <ka...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38746/#review101996
-----------------------------------------------------------


Thanks for doing this, Jie! I have one nit and one question about `Isolator::prepare` failure messages below.


src/slave/containerizer/mesos/containerizer.cpp (lines 1107 - 1113)
<https://reviews.apache.org/r/38746/#comment159520>

    I am slightly confused here. If one or more `Isolator::prepare` calls returned a failure, where are we accumulating the failure messages?



src/slave/slave.hpp (line 303)
<https://reviews.apache.org/r/38746/#comment159519>

    s/routine/routines/


- Kapil Arya


On Oct. 8, 2015, 9:28 p.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38746/
> -----------------------------------------------------------
> 
> (Updated Oct. 8, 2015, 9:28 p.m.)
> 
> 
> Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
> 
> 
> Bugs: MESOS-2035
>     https://issues.apache.org/jira/browse/MESOS-2035
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added TaskStatus::Reason to containerizer Termination message.
> 
> The following doc summarized the problem and proposed a solution:
> https://docs.google.com/document/d/1klGDAu5yBVf-CGWLqvELLIfxLfRaisGkhi6Gn7952-4/edit?usp=sharing
> 
> 
> Diffs
> -----
> 
>   include/mesos/containerizer/containerizer.proto f16ccc89f83da28c413ccfa0687a06b7515a605c 
>   include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 
>   include/mesos/slave/isolator.proto 9d38a25470a79af1eda9122c037c82b8cbbad6ed 
>   src/common/protobuf_utils.hpp 8793851fb927ab1326da6b6a424b3c6a75eb5001 
>   src/common/protobuf_utils.cpp c1e8e011e6d0b8e1cd8a836e3168685ec401b21b 
>   src/slave/containerizer/docker.cpp 6c975f904178e01797b67628a2d471ec7b3b1fbf 
>   src/slave/containerizer/external_containerizer.cpp 211649201777f0d2ce802a865090129eacdd53be 
>   src/slave/containerizer/isolators/cgroups/mem.cpp 89c86beb9227eb8a6e70a413e7b3934add652981 
>   src/slave/containerizer/isolators/posix/disk.cpp c324c79f8d598095d07fbcb26e806a0978c2a520 
>   src/slave/containerizer/mesos/containerizer.hpp 4c1419290645ad4c44360a81618a6cea7ad190df 
>   src/slave/containerizer/mesos/containerizer.cpp b904b2d88e9b62fa4ba312c4569a4d89b0dc6052 
>   src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 
>   src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
>   src/tests/containerizer.cpp 1f743155526192569dd61a47ab67d8e58aab205a 
>   src/tests/containerizer/docker_containerizer_tests.cpp 8771ef661039310d79845513ea4602e15b2ad13d 
>   src/tests/containerizer/mesos_containerizer_tests.cpp 5bc7d408bda0c249e1b66747d8bd87e688362e6c 
>   src/tests/slave_recovery_tests.cpp fd285fbba0ec08fe2afbdd8da0b0b451126da0eb 
>   src/tests/slave_tests.cpp dccdbb0ec5352df5da63d5ef7261bfc7fd599acb 
> 
> Diff: https://reviews.apache.org/r/38746/diff/
> 
> 
> Testing
> -------
> 
> sudo make check
> 
> tested with Docker as well.
> 
> 
> Thanks,
> 
> Jie Yu
> 
>


Re: Review Request 38746: Added TaskStatus::Reason to containerizer Termination message.

Posted by Jie Yu <yu...@gmail.com>.

> On Oct. 9, 2015, 10:54 p.m., Ben Mahler wrote:
> > include/mesos/containerizer/containerizer.proto, lines 95-99
> > <https://reviews.apache.org/r/38746/diff/2/?file=1093434#file1093434line95>
> >
> >     Why do you say "executor" here but termination is about "container" above?
> >     
> >     Also, can you:
> >     
> >     s/task_state/state/
> >     s/task_status_reasons/reasons/
> >     
> >     s/'reason'/'reasons'/
> >     s/of a status update for those pending/unterminated tasks/of status updates for non-terminal tasks/

Done.


> On Oct. 9, 2015, 10:54 p.m., Ben Mahler wrote:
> > include/mesos/mesos.proto, line 1100
> > <https://reviews.apache.org/r/38746/diff/2/?file=1093435#file1093435line1100>
> >
> >     0 should have been UNKNOWN, ugh :(

Added a TODO to remove it. This enum is not used anymore.


> On Oct. 9, 2015, 10:54 p.m., Ben Mahler wrote:
> > include/mesos/mesos.proto, lines 1121-1124
> > <https://reviews.apache.org/r/38746/diff/2/?file=1093435#file1093435line1121>
> >
> >     Can you keep these alphabetical?
> >     
> >     It's a bit unfortunate that we don't have a common "REASON_CONTAINER_LIMITATION" prefix for the limitations. Lesson learned for the next time we decide to name enums to use use prefixing to group related enums...
> >     
> >     Do you want to add a generic limitation? I believe it's ok to rename these from a conversation with vinod, given they were intended for metrics rather than control flow in schedulers:
> >     
> >     REASON_CONTAINER_LIMITATION
> >     REASON_CONTAINER_LIMITATION_DISK
> >     REASON_CONTAINER_LIMITATION_MEMORY
> >     
> >     Note the presence of a general limitation that custom isolators can use.

Done.


> On Oct. 9, 2015, 10:54 p.m., Ben Mahler wrote:
> > include/mesos/slave/isolator.proto, line 42
> > <https://reviews.apache.org/r/38746/diff/2/?file=1093436#file1093436line42>
> >
> >     How about: s/for those tasks that are in non-terminal status when the container is terminated/for any remaining non-terminal tasks/ ?

Done.


> On Oct. 9, 2015, 10:54 p.m., Ben Mahler wrote:
> > include/mesos/slave/isolator.proto, line 44
> > <https://reviews.apache.org/r/38746/diff/2/?file=1093436#file1093436line44>
> >
> >     Why not s/task_status_reason/reason/ ?

Done.


> On Oct. 9, 2015, 10:54 p.m., Ben Mahler wrote:
> > src/slave/containerizer/mesos/containerizer.cpp, lines 1244-1246
> > <https://reviews.apache.org/r/38746/diff/2/?file=1093444#file1093444line1244>
> >
> >     Can you use the arrow operator here?
> >     
> >     status->isSome
> >     status->get

Yes. Thanks!


> On Oct. 9, 2015, 10:54 p.m., Ben Mahler wrote:
> > src/slave/containerizer/mesos/containerizer.cpp, line 1252
> > <https://reviews.apache.org/r/38746/diff/2/?file=1093444#file1093444line1252>
> >
> >     ! empty() ?

Done.


> On Oct. 9, 2015, 10:54 p.m., Ben Mahler wrote:
> > src/slave/containerizer/mesos/containerizer.cpp, line 1269
> > <https://reviews.apache.org/r/38746/diff/2/?file=1093444#file1093444line1269>
> >
> >     Why did you need the trim here?

Removed.


> On Oct. 9, 2015, 10:54 p.m., Ben Mahler wrote:
> > src/slave/slave.hpp, lines 306-307
> > <https://reviews.apache.org/r/38746/diff/2/?file=1093445#file1093445line306>
> >
> >     Could you wrap on the next line, as is done with getExecutorInfo?

Done.


> On Oct. 9, 2015, 10:54 p.m., Ben Mahler wrote:
> > src/slave/slave.hpp, lines 629-636
> > <https://reviews.apache.org/r/38746/diff/2/?file=1093445#file1093445line629>
> >
> >     Hm.. I think people mind get a bit confused when they encounter this, how about we add a comment to guide them and rename it like the following:
> >     
> >     ```
> >     // When the agent initiates a destroyal of the container,
> >     // we expect a termination to occur. The 'pendingTermation'
> >     // indicates why the agent initiated the destruction and
> >     // will influence the information sent in the status updates
> >     // for any remaining non-terminal tasks.
> >     Option<Termination> pendingTermination;
> >     ```

Done.


> On Oct. 9, 2015, 10:54 p.m., Ben Mahler wrote:
> > src/slave/slave.cpp, line 4725
> > <https://reviews.apache.org/r/38746/diff/2/?file=1093446#file1093446line4725>
> >
> >     How about:
> >     
> >     ```
> >     (!termination->reasons().empty)
> >     ```
> >     
> >     instead of the size check

Looks like the protobuf we bundle does not have this method defined. It's introduced in later versions.


- Jie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38746/#review102111
-----------------------------------------------------------


On Oct. 9, 2015, 1:28 a.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38746/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2015, 1:28 a.m.)
> 
> 
> Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
> 
> 
> Bugs: MESOS-2035
>     https://issues.apache.org/jira/browse/MESOS-2035
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added TaskStatus::Reason to containerizer Termination message.
> 
> The following doc summarized the problem and proposed a solution:
> https://docs.google.com/document/d/1klGDAu5yBVf-CGWLqvELLIfxLfRaisGkhi6Gn7952-4/edit?usp=sharing
> 
> 
> Diffs
> -----
> 
>   include/mesos/containerizer/containerizer.proto f16ccc89f83da28c413ccfa0687a06b7515a605c 
>   include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 
>   include/mesos/slave/isolator.proto 9d38a25470a79af1eda9122c037c82b8cbbad6ed 
>   src/common/protobuf_utils.hpp 8793851fb927ab1326da6b6a424b3c6a75eb5001 
>   src/common/protobuf_utils.cpp c1e8e011e6d0b8e1cd8a836e3168685ec401b21b 
>   src/slave/containerizer/docker.cpp 6c975f904178e01797b67628a2d471ec7b3b1fbf 
>   src/slave/containerizer/external_containerizer.cpp 211649201777f0d2ce802a865090129eacdd53be 
>   src/slave/containerizer/isolators/cgroups/mem.cpp 89c86beb9227eb8a6e70a413e7b3934add652981 
>   src/slave/containerizer/isolators/posix/disk.cpp c324c79f8d598095d07fbcb26e806a0978c2a520 
>   src/slave/containerizer/mesos/containerizer.hpp 4c1419290645ad4c44360a81618a6cea7ad190df 
>   src/slave/containerizer/mesos/containerizer.cpp b904b2d88e9b62fa4ba312c4569a4d89b0dc6052 
>   src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 
>   src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
>   src/tests/containerizer.cpp 1f743155526192569dd61a47ab67d8e58aab205a 
>   src/tests/containerizer/docker_containerizer_tests.cpp 8771ef661039310d79845513ea4602e15b2ad13d 
>   src/tests/containerizer/mesos_containerizer_tests.cpp 5bc7d408bda0c249e1b66747d8bd87e688362e6c 
>   src/tests/slave_recovery_tests.cpp fd285fbba0ec08fe2afbdd8da0b0b451126da0eb 
>   src/tests/slave_tests.cpp dccdbb0ec5352df5da63d5ef7261bfc7fd599acb 
> 
> Diff: https://reviews.apache.org/r/38746/diff/
> 
> 
> Testing
> -------
> 
> sudo make check
> 
> tested with Docker as well.
> 
> 
> Thanks,
> 
> Jie Yu
> 
>


Re: Review Request 38746: Added TaskStatus::Reason to containerizer Termination message.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38746/#review102111
-----------------------------------------------------------


Overall looks good, some of the comments I made can be addressed in separate reviews.


include/mesos/containerizer/containerizer.proto (lines 88 - 92)
<https://reviews.apache.org/r/38746/#comment159628>

    Why do you say "executor" here but termination is about "container" above?
    
    Also, can you:
    
    s/task_state/state/
    s/task_status_reasons/reasons/
    
    s/'reason'/'reasons'/
    s/of a status update for those pending/unterminated tasks/of status updates for non-terminal tasks/



include/mesos/mesos.proto (line 1100)
<https://reviews.apache.org/r/38746/#comment159647>

    0 should have been UNKNOWN, ugh :(



include/mesos/mesos.proto (lines 1119 - 1122)
<https://reviews.apache.org/r/38746/#comment159626>

    Can you keep these alphabetical?
    
    It's a bit unfortunate that we don't have a common "REASON_CONTAINER_LIMITATION" prefix for the limitations. Lesson learned for the next time we decide to name enums to use use prefixing to group related enums...
    
    Do you want to add a generic limitation? I believe it's ok to rename these from a conversation with vinod, given they were intended for metrics rather than control flow in schedulers:
    
    REASON_CONTAINER_LIMITATION
    REASON_CONTAINER_LIMITATION_DISK
    REASON_CONTAINER_LIMITATION_MEMORY
    
    Note the presence of a general limitation that custom isolators can use.



include/mesos/slave/isolator.proto (line 42)
<https://reviews.apache.org/r/38746/#comment159631>

    How about: s/for those tasks that are in non-terminal status when the container is terminated/for any remaining non-terminal tasks/ ?



include/mesos/slave/isolator.proto (line 44)
<https://reviews.apache.org/r/38746/#comment159630>

    Why not s/task_status_reason/reason/ ?



src/slave/containerizer/mesos/containerizer.cpp (lines 1235 - 1237)
<https://reviews.apache.org/r/38746/#comment159645>

    Can you use the arrow operator here?
    
    status->isSome
    status->get



src/slave/containerizer/mesos/containerizer.cpp (line 1243)
<https://reviews.apache.org/r/38746/#comment159643>

    ! empty() ?



src/slave/containerizer/mesos/containerizer.cpp (line 1257)
<https://reviews.apache.org/r/38746/#comment159644>

    Why did you need the trim here?



src/slave/slave.hpp (lines 306 - 307)
<https://reviews.apache.org/r/38746/#comment159632>

    Could you wrap on the next line, as is done with getExecutorInfo?



src/slave/slave.hpp (lines 629 - 636)
<https://reviews.apache.org/r/38746/#comment159634>

    Hm.. I think people mind get a bit confused when they encounter this, how about we add a comment to guide them and rename it like the following:
    
    ```
    // When the agent initiates a destroyal of the container,
    // we expect a termination to occur. The 'pendingTermation'
    // indicates why the agent initiated the destruction and
    // will influence the information sent in the status updates
    // for any remaining non-terminal tasks.
    Option<Termination> pendingTermination;
    ```



src/slave/slave.cpp (line 1661)
<https://reviews.apache.org/r/38746/#comment159648>

    Shall we rename this to 'update' to be a bit clearer?



src/slave/slave.cpp (line 2656)
<https://reviews.apache.org/r/38746/#comment159649>

    Ditto here.



src/slave/slave.cpp (lines 2668 - 2678)
<https://reviews.apache.org/r/38746/#comment159661>

    Weird that some of the existing code paths don't set the executor state to TERMINATING, do we need a function to make the destroyal code consistent?
    
    Shall we follow up in a separate patch?



src/slave/slave.cpp (line 2716)
<https://reviews.apache.org/r/38746/#comment159650>

    Is it easy to add the value of the timeout here? E.g.
    
    ```
    Executor did not re-register within 10secs
    ```



src/slave/slave.cpp (line 2931)
<https://reviews.apache.org/r/38746/#comment159651>

    Ditto here.



src/slave/slave.cpp (line 2955)
<https://reviews.apache.org/r/38746/#comment159652>

    Please use the arrow operator going forward :)
    
    future->isFailed
    future->failure



src/slave/slave.cpp (line 3248)
<https://reviews.apache.org/r/38746/#comment159635>

    Mind avoiding the ternary just for clarity? Up to you.



src/slave/slave.cpp (line 3372)
<https://reviews.apache.org/r/38746/#comment159653>

    Ditto here, this is 'launched'?



src/slave/slave.cpp (line 3398)
<https://reviews.apache.org/r/38746/#comment159662>

    Odd that we don't go set the state to TERMINATING here as well, for example.



src/slave/slave.cpp (lines 3954 - 3955)
<https://reviews.apache.org/r/38746/#comment159654>

    Ditto here about adding the value if possible



src/slave/slave.cpp (line 4398)
<https://reviews.apache.org/r/38746/#comment159667>

    Odd that this isn't 'corrections', looks like we don't need the 'corrections' variable below, not sure why we bother with the extra variable here. But this is for another patch..



src/slave/slave.cpp (lines 4709 - 4740)
<https://reviews.apache.org/r/38746/#comment159687>

    Can you use the arrow operator here on 'termination'?



src/slave/slave.cpp (line 4720)
<https://reviews.apache.org/r/38746/#comment159688>

    How about:
    
    ```
    (!termination->reasons().empty)
    ```
    
    instead of the size check



src/tests/slave_recovery_tests.cpp (lines 594 - 596)
<https://reviews.apache.org/r/38746/#comment159638>

    Here and the one below in this file, could you also add assertions for the expected reason?
    
    It would be nice to do a pass adding in more reason assertions across the tests to validate this change, either in this patch or in a follow up patch (the latter ideally).


- Ben Mahler


On Oct. 9, 2015, 1:28 a.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38746/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2015, 1:28 a.m.)
> 
> 
> Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
> 
> 
> Bugs: MESOS-2035
>     https://issues.apache.org/jira/browse/MESOS-2035
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added TaskStatus::Reason to containerizer Termination message.
> 
> The following doc summarized the problem and proposed a solution:
> https://docs.google.com/document/d/1klGDAu5yBVf-CGWLqvELLIfxLfRaisGkhi6Gn7952-4/edit?usp=sharing
> 
> 
> Diffs
> -----
> 
>   include/mesos/containerizer/containerizer.proto f16ccc89f83da28c413ccfa0687a06b7515a605c 
>   include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 
>   include/mesos/slave/isolator.proto 9d38a25470a79af1eda9122c037c82b8cbbad6ed 
>   src/common/protobuf_utils.hpp 8793851fb927ab1326da6b6a424b3c6a75eb5001 
>   src/common/protobuf_utils.cpp c1e8e011e6d0b8e1cd8a836e3168685ec401b21b 
>   src/slave/containerizer/docker.cpp 6c975f904178e01797b67628a2d471ec7b3b1fbf 
>   src/slave/containerizer/external_containerizer.cpp 211649201777f0d2ce802a865090129eacdd53be 
>   src/slave/containerizer/isolators/cgroups/mem.cpp 89c86beb9227eb8a6e70a413e7b3934add652981 
>   src/slave/containerizer/isolators/posix/disk.cpp c324c79f8d598095d07fbcb26e806a0978c2a520 
>   src/slave/containerizer/mesos/containerizer.hpp 4c1419290645ad4c44360a81618a6cea7ad190df 
>   src/slave/containerizer/mesos/containerizer.cpp b904b2d88e9b62fa4ba312c4569a4d89b0dc6052 
>   src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 
>   src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
>   src/tests/containerizer.cpp 1f743155526192569dd61a47ab67d8e58aab205a 
>   src/tests/containerizer/docker_containerizer_tests.cpp 8771ef661039310d79845513ea4602e15b2ad13d 
>   src/tests/containerizer/mesos_containerizer_tests.cpp 5bc7d408bda0c249e1b66747d8bd87e688362e6c 
>   src/tests/slave_recovery_tests.cpp fd285fbba0ec08fe2afbdd8da0b0b451126da0eb 
>   src/tests/slave_tests.cpp dccdbb0ec5352df5da63d5ef7261bfc7fd599acb 
> 
> Diff: https://reviews.apache.org/r/38746/diff/
> 
> 
> Testing
> -------
> 
> sudo make check
> 
> tested with Docker as well.
> 
> 
> Thanks,
> 
> Jie Yu
> 
>


Re: Review Request 38746: Added TaskStatus::Reason to containerizer Termination message.

Posted by Jie Yu <yu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38746/
-----------------------------------------------------------

(Updated Oct. 9, 2015, 1:28 a.m.)


Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.


Bugs: MESOS-2035
    https://issues.apache.org/jira/browse/MESOS-2035


Repository: mesos


Description
-------

Added TaskStatus::Reason to containerizer Termination message.

The following doc summarized the problem and proposed a solution:
https://docs.google.com/document/d/1klGDAu5yBVf-CGWLqvELLIfxLfRaisGkhi6Gn7952-4/edit?usp=sharing


Diffs
-----

  include/mesos/containerizer/containerizer.proto f16ccc89f83da28c413ccfa0687a06b7515a605c 
  include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 
  include/mesos/slave/isolator.proto 9d38a25470a79af1eda9122c037c82b8cbbad6ed 
  src/common/protobuf_utils.hpp 8793851fb927ab1326da6b6a424b3c6a75eb5001 
  src/common/protobuf_utils.cpp c1e8e011e6d0b8e1cd8a836e3168685ec401b21b 
  src/slave/containerizer/docker.cpp 6c975f904178e01797b67628a2d471ec7b3b1fbf 
  src/slave/containerizer/external_containerizer.cpp 211649201777f0d2ce802a865090129eacdd53be 
  src/slave/containerizer/isolators/cgroups/mem.cpp 89c86beb9227eb8a6e70a413e7b3934add652981 
  src/slave/containerizer/isolators/posix/disk.cpp c324c79f8d598095d07fbcb26e806a0978c2a520 
  src/slave/containerizer/mesos/containerizer.hpp 4c1419290645ad4c44360a81618a6cea7ad190df 
  src/slave/containerizer/mesos/containerizer.cpp b904b2d88e9b62fa4ba312c4569a4d89b0dc6052 
  src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 
  src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
  src/tests/containerizer.cpp 1f743155526192569dd61a47ab67d8e58aab205a 
  src/tests/containerizer/docker_containerizer_tests.cpp 8771ef661039310d79845513ea4602e15b2ad13d 
  src/tests/containerizer/mesos_containerizer_tests.cpp 5bc7d408bda0c249e1b66747d8bd87e688362e6c 
  src/tests/slave_recovery_tests.cpp fd285fbba0ec08fe2afbdd8da0b0b451126da0eb 
  src/tests/slave_tests.cpp dccdbb0ec5352df5da63d5ef7261bfc7fd599acb 

Diff: https://reviews.apache.org/r/38746/diff/


Testing (updated)
-------

sudo make check

tested with Docker as well.


Thanks,

Jie Yu


Re: Review Request 38746: Added TaskStatus::Reason to containerizer Termination message.

Posted by Jie Yu <yu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38746/
-----------------------------------------------------------

(Updated Oct. 9, 2015, 1:27 a.m.)


Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.


Bugs: MESOS-2035
    https://issues.apache.org/jira/browse/MESOS-2035


Repository: mesos


Description (updated)
-------

Added TaskStatus::Reason to containerizer Termination message.

The following doc summarized the problem and proposed a solution:
https://docs.google.com/document/d/1klGDAu5yBVf-CGWLqvELLIfxLfRaisGkhi6Gn7952-4/edit?usp=sharing


Diffs
-----

  include/mesos/containerizer/containerizer.proto f16ccc89f83da28c413ccfa0687a06b7515a605c 
  include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 
  include/mesos/slave/isolator.proto 9d38a25470a79af1eda9122c037c82b8cbbad6ed 
  src/common/protobuf_utils.hpp 8793851fb927ab1326da6b6a424b3c6a75eb5001 
  src/common/protobuf_utils.cpp c1e8e011e6d0b8e1cd8a836e3168685ec401b21b 
  src/slave/containerizer/docker.cpp 6c975f904178e01797b67628a2d471ec7b3b1fbf 
  src/slave/containerizer/external_containerizer.cpp 211649201777f0d2ce802a865090129eacdd53be 
  src/slave/containerizer/isolators/cgroups/mem.cpp 89c86beb9227eb8a6e70a413e7b3934add652981 
  src/slave/containerizer/isolators/posix/disk.cpp c324c79f8d598095d07fbcb26e806a0978c2a520 
  src/slave/containerizer/mesos/containerizer.hpp 4c1419290645ad4c44360a81618a6cea7ad190df 
  src/slave/containerizer/mesos/containerizer.cpp b904b2d88e9b62fa4ba312c4569a4d89b0dc6052 
  src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 
  src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
  src/tests/containerizer.cpp 1f743155526192569dd61a47ab67d8e58aab205a 
  src/tests/containerizer/docker_containerizer_tests.cpp 8771ef661039310d79845513ea4602e15b2ad13d 
  src/tests/containerizer/mesos_containerizer_tests.cpp 5bc7d408bda0c249e1b66747d8bd87e688362e6c 
  src/tests/slave_recovery_tests.cpp fd285fbba0ec08fe2afbdd8da0b0b451126da0eb 
  src/tests/slave_tests.cpp dccdbb0ec5352df5da63d5ef7261bfc7fd599acb 

Diff: https://reviews.apache.org/r/38746/diff/


Testing
-------

sudo make check

@tnachen, I need you to help test this patch with docker containerizer.


Thanks,

Jie Yu


Re: Review Request 38746: Added TaskStatus::Reason to containerizer Termination message.

Posted by Jie Yu <yu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38746/
-----------------------------------------------------------

(Updated Oct. 9, 2015, 1:27 a.m.)


Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.


Changes
-------

Did a few changes according to this doc:
https://docs.google.com/document/d/1klGDAu5yBVf-CGWLqvELLIfxLfRaisGkhi6Gn7952-4/edit?usp=sharing


Bugs: MESOS-2035
    https://issues.apache.org/jira/browse/MESOS-2035


Repository: mesos


Description
-------

Added TaskStatus::Reason to containerizer Termination message.


Diffs (updated)
-----

  include/mesos/containerizer/containerizer.proto f16ccc89f83da28c413ccfa0687a06b7515a605c 
  include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 
  include/mesos/slave/isolator.proto 9d38a25470a79af1eda9122c037c82b8cbbad6ed 
  src/common/protobuf_utils.hpp 8793851fb927ab1326da6b6a424b3c6a75eb5001 
  src/common/protobuf_utils.cpp c1e8e011e6d0b8e1cd8a836e3168685ec401b21b 
  src/slave/containerizer/docker.cpp 6c975f904178e01797b67628a2d471ec7b3b1fbf 
  src/slave/containerizer/external_containerizer.cpp 211649201777f0d2ce802a865090129eacdd53be 
  src/slave/containerizer/isolators/cgroups/mem.cpp 89c86beb9227eb8a6e70a413e7b3934add652981 
  src/slave/containerizer/isolators/posix/disk.cpp c324c79f8d598095d07fbcb26e806a0978c2a520 
  src/slave/containerizer/mesos/containerizer.hpp 4c1419290645ad4c44360a81618a6cea7ad190df 
  src/slave/containerizer/mesos/containerizer.cpp b904b2d88e9b62fa4ba312c4569a4d89b0dc6052 
  src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 
  src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
  src/tests/containerizer.cpp 1f743155526192569dd61a47ab67d8e58aab205a 
  src/tests/containerizer/docker_containerizer_tests.cpp 8771ef661039310d79845513ea4602e15b2ad13d 
  src/tests/containerizer/mesos_containerizer_tests.cpp 5bc7d408bda0c249e1b66747d8bd87e688362e6c 
  src/tests/slave_recovery_tests.cpp fd285fbba0ec08fe2afbdd8da0b0b451126da0eb 
  src/tests/slave_tests.cpp dccdbb0ec5352df5da63d5ef7261bfc7fd599acb 

Diff: https://reviews.apache.org/r/38746/diff/


Testing
-------

sudo make check

@tnachen, I need you to help test this patch with docker containerizer.


Thanks,

Jie Yu


Re: Review Request 38746: Added TaskStatus::Reason to containerizer Termination message.

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38746/#review100526
-----------------------------------------------------------


Patch looks great!

Reviews applied: [38746]

All tests passed.

- Mesos ReviewBot


On Sept. 25, 2015, 2:14 a.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38746/
> -----------------------------------------------------------
> 
> (Updated Sept. 25, 2015, 2:14 a.m.)
> 
> 
> Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
> 
> 
> Bugs: MESOS-2035
>     https://issues.apache.org/jira/browse/MESOS-2035
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added TaskStatus::Reason to containerizer Termination message.
> 
> 
> Diffs
> -----
> 
>   include/mesos/containerizer/containerizer.proto f16ccc89f83da28c413ccfa0687a06b7515a605c 
>   include/mesos/mesos.proto 4a16be1f570769f3ce42a50a9da9f4fb1c227999 
>   include/mesos/slave/isolator.proto 9d38a25470a79af1eda9122c037c82b8cbbad6ed 
>   src/common/protobuf_utils.hpp 8793851fb927ab1326da6b6a424b3c6a75eb5001 
>   src/common/protobuf_utils.cpp c1e8e011e6d0b8e1cd8a836e3168685ec401b21b 
>   src/slave/containerizer/docker.cpp efa37266368ee12fea9134b35ebc5047a2820f94 
>   src/slave/containerizer/isolators/cgroups/mem.cpp 89c86beb9227eb8a6e70a413e7b3934add652981 
>   src/slave/containerizer/isolators/posix/disk.cpp c324c79f8d598095d07fbcb26e806a0978c2a520 
>   src/slave/containerizer/mesos/containerizer.hpp 4c1419290645ad4c44360a81618a6cea7ad190df 
>   src/slave/containerizer/mesos/containerizer.cpp b904b2d88e9b62fa4ba312c4569a4d89b0dc6052 
>   src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
>   src/tests/containerizer/docker_containerizer_tests.cpp 8771ef661039310d79845513ea4602e15b2ad13d 
>   src/tests/containerizer/mesos_containerizer_tests.cpp 5bc7d408bda0c249e1b66747d8bd87e688362e6c 
>   src/tests/slave_recovery_tests.cpp fd285fbba0ec08fe2afbdd8da0b0b451126da0eb 
>   src/tests/slave_tests.cpp dccdbb0ec5352df5da63d5ef7261bfc7fd599acb 
> 
> Diff: https://reviews.apache.org/r/38746/diff/
> 
> 
> Testing
> -------
> 
> sudo make check
> 
> @tnachen, I need you to help test this patch with docker containerizer.
> 
> 
> Thanks,
> 
> Jie Yu
> 
>