You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Qian Zhang <zh...@gmail.com> on 2020/02/25 01:43:01 UTC

Re: Review Request 71858: Set resource limits when launching executor container.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71858/
-----------------------------------------------------------

(Updated Feb. 25, 2020, 9:43 a.m.)


Review request for mesos, Andrei Budnik and Greg Mann.


Changes
-------

Accommodated infinite value.


Summary (updated)
-----------------

Set resource limits when launching executor container.


Bugs: MESOS-10046
    https://issues.apache.org/jira/browse/MESOS-10046


Repository: mesos


Description (updated)
-------

Set resource limits when launching executor container.


Diffs (updated)
-----

  src/slave/slave.hpp 03279db082bdba23dbfeb2d93081a908e609aec2 
  src/slave/slave.cpp cce275a504effae7a6b71dd333ce8a300d1ce5be 


Diff: https://reviews.apache.org/r/71858/diff/9/

Changes: https://reviews.apache.org/r/71858/diff/8-9/


Testing
-------


Thanks,

Qian Zhang


Re: Review Request 71858: Set resource limits when launching executor container.

Posted by Qian Zhang <zh...@gmail.com>.

> On Feb. 29, 2020, 1:38 a.m., Greg Mann wrote:
> > src/slave/slave.cpp
> > Lines 3201-3204 (original), 3285-3291 (patched)
> > <https://reviews.apache.org/r/71858/diff/10/?file=2212019#file2212019line3285>
> >
> >     So it looks like here, we will only set limits for the executor if they are explicitly set for its tasks. This is different than what we intend to do for task containers, since in that case we will set their limits equal to their requests if `share_cgroups==true` (modulo the value of the agent's `--cgroups_enable_cfs` flag).
> >     
> >     Should we do the same for the executor container?

> since in that case we will set their limits equal to their requests if share_cgroups==true (modulo the value of the agent's --cgroups_enable_cfs flag).

Did you mean `share_cgroups` is true or false? I think if it is true, we will neither set soft limit nor the hard limit for the individual task since it does not have its own cgroups.


I think we already do the similar for the executor container, I mean if no tasks have limits set, we will not set executor limit here, but we will always set executor resource requests to the sum of all its task's resource requests:
```
*executorInfo_.mutable_resources() =
      Resources(executorInfo.resources()) + totalTaskResources;
```
So in CPU subsystem of cgroup isolator, we will set executor container's CFS quota to its resource request if `--cgroups_enable_cfs` is true.


- Qian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71858/#review219692
-----------------------------------------------------------


On Feb. 25, 2020, 9:46 a.m., Qian Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71858/
> -----------------------------------------------------------
> 
> (Updated Feb. 25, 2020, 9:46 a.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10046
>     https://issues.apache.org/jira/browse/MESOS-10046
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Set resource limits when launching executor container.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 03279db082bdba23dbfeb2d93081a908e609aec2 
>   src/slave/slave.cpp cce275a504effae7a6b71dd333ce8a300d1ce5be 
> 
> 
> Diff: https://reviews.apache.org/r/71858/diff/10/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>


Re: Review Request 71858: Set resource limits when launching executor container.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71858/#review219692
-----------------------------------------------------------




src/slave/slave.cpp
Lines 3201-3204 (original), 3285-3291 (patched)
<https://reviews.apache.org/r/71858/#comment307909>

    So it looks like here, we will only set limits for the executor if they are explicitly set for its tasks. This is different than what we intend to do for task containers, since in that case we will set their limits equal to their requests if `share_cgroups==true` (modulo the value of the agent's `--cgroups_enable_cfs` flag).
    
    Should we do the same for the executor container?


- Greg Mann


On Feb. 25, 2020, 1:46 a.m., Qian Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71858/
> -----------------------------------------------------------
> 
> (Updated Feb. 25, 2020, 1:46 a.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10046
>     https://issues.apache.org/jira/browse/MESOS-10046
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Set resource limits when launching executor container.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 03279db082bdba23dbfeb2d93081a908e609aec2 
>   src/slave/slave.cpp cce275a504effae7a6b71dd333ce8a300d1ce5be 
> 
> 
> Diff: https://reviews.apache.org/r/71858/diff/10/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>


Re: Review Request 71858: Set resource limits when launching executor container.

Posted by Qian Zhang <zh...@gmail.com>.

> On Feb. 29, 2020, 1:55 a.m., Greg Mann wrote:
> > src/slave/slave.cpp
> > Lines 3201-3204 (original), 3285-3291 (patched)
> > <https://reviews.apache.org/r/71858/diff/10/?file=2212019#file2212019line3285>
> >
> >     I was just thinking about this more... IIUC, we actually want to always set limits for executors right now, don't we? We previously discussed that we have several cases to handle (default, command, and custom executors), and I believe in the case of custom executors, we need to be sure that limits are always set. Wouldn't the current code allow a custom executor to be launched with no resource constraints if the framework specifies `shared_cgroups==false` and does not set limits in the tasks?

> Wouldn't the current code allow a custom executor to be launched with no resource constraints if the framework specifies shared_cgroups==false and does not set limits in the tasks?

In that case, for memory subsystem we will set custom executor's limit to its request which is the sum of all its task's request, so custom executor will still be constrained. For CPU subsystem, it depends on the `--cgroups_enable_cfs` flag, if this flag is true, then similar with memory subsytem we will set custom executor's CFS quota to its request which is the sum of all its task's request, if this flag is false, then yes the custom executor will not be constrained which makes sense because its tasks need to be unconstrained as well (since they have no limit set).


- Qian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71858/#review219696
-----------------------------------------------------------


On Feb. 25, 2020, 9:46 a.m., Qian Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71858/
> -----------------------------------------------------------
> 
> (Updated Feb. 25, 2020, 9:46 a.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10046
>     https://issues.apache.org/jira/browse/MESOS-10046
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Set resource limits when launching executor container.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 03279db082bdba23dbfeb2d93081a908e609aec2 
>   src/slave/slave.cpp cce275a504effae7a6b71dd333ce8a300d1ce5be 
> 
> 
> Diff: https://reviews.apache.org/r/71858/diff/10/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>


Re: Review Request 71858: Set resource limits when launching executor container.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71858/#review219696
-----------------------------------------------------------




src/slave/slave.cpp
Lines 3201-3204 (original), 3285-3291 (patched)
<https://reviews.apache.org/r/71858/#comment307913>

    I was just thinking about this more... IIUC, we actually want to always set limits for executors right now, don't we? We previously discussed that we have several cases to handle (default, command, and custom executors), and I believe in the case of custom executors, we need to be sure that limits are always set. Wouldn't the current code allow a custom executor to be launched with no resource constraints if the framework specifies `shared_cgroups==false` and does not set limits in the tasks?


- Greg Mann


On Feb. 25, 2020, 1:46 a.m., Qian Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71858/
> -----------------------------------------------------------
> 
> (Updated Feb. 25, 2020, 1:46 a.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10046
>     https://issues.apache.org/jira/browse/MESOS-10046
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Set resource limits when launching executor container.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 03279db082bdba23dbfeb2d93081a908e609aec2 
>   src/slave/slave.cpp cce275a504effae7a6b71dd333ce8a300d1ce5be 
> 
> 
> Diff: https://reviews.apache.org/r/71858/diff/10/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>


Re: Review Request 71858: Set resource limits when launching executor container.

Posted by Greg Mann <gr...@mesosphere.io>.

> On Feb. 28, 2020, 5:57 p.m., Greg Mann wrote:
> > src/slave/slave.cpp
> > Lines 3262 (patched)
> > <https://reviews.apache.org/r/71858/diff/10/?file=2212019#file2212019line3262>
> >
> >     We have validation which ensures that executor CPU and mem are always greater than or equal to the minimum values, so I think this could be a `CHECK()`?
> >     
> >     https://github.com/apache/mesos/blob/998aee66bfedd1fe15bb1e1fc43a637fe91662a5/src/master/validation.cpp#L1842-L1859
> 
> Qian Zhang wrote:
>     Unfortunately, in `task::internal::validateExecutor()`, if there is no CPU or memory in executor resources, we will just log a warning message but not return an error.
>     
>     https://github.com/apache/mesos/blob/1.9.0/src/master/validation.cpp#L1638-L1648

Ah shoot, too bad :'(


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71858/#review219697
-----------------------------------------------------------


On Feb. 25, 2020, 1:46 a.m., Qian Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71858/
> -----------------------------------------------------------
> 
> (Updated Feb. 25, 2020, 1:46 a.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10046
>     https://issues.apache.org/jira/browse/MESOS-10046
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Set resource limits when launching executor container.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 03279db082bdba23dbfeb2d93081a908e609aec2 
>   src/slave/slave.cpp cce275a504effae7a6b71dd333ce8a300d1ce5be 
> 
> 
> Diff: https://reviews.apache.org/r/71858/diff/10/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>


Re: Review Request 71858: Set resource limits when launching executor container.

Posted by Qian Zhang <zh...@gmail.com>.

> On Feb. 29, 2020, 1:57 a.m., Greg Mann wrote:
> > src/slave/slave.cpp
> > Lines 3262 (patched)
> > <https://reviews.apache.org/r/71858/diff/10/?file=2212019#file2212019line3262>
> >
> >     We have validation which ensures that executor CPU and mem are always greater than or equal to the minimum values, so I think this could be a `CHECK()`?
> >     
> >     https://github.com/apache/mesos/blob/998aee66bfedd1fe15bb1e1fc43a637fe91662a5/src/master/validation.cpp#L1842-L1859

Unfortunately, in `task::internal::validateExecutor()`, if there is no CPU or memory in executor resources, we will just log a warning message but not return an error.

https://github.com/apache/mesos/blob/1.9.0/src/master/validation.cpp#L1638-L1648


- Qian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71858/#review219697
-----------------------------------------------------------


On Feb. 25, 2020, 9:46 a.m., Qian Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71858/
> -----------------------------------------------------------
> 
> (Updated Feb. 25, 2020, 9:46 a.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10046
>     https://issues.apache.org/jira/browse/MESOS-10046
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Set resource limits when launching executor container.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 03279db082bdba23dbfeb2d93081a908e609aec2 
>   src/slave/slave.cpp cce275a504effae7a6b71dd333ce8a300d1ce5be 
> 
> 
> Diff: https://reviews.apache.org/r/71858/diff/10/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>


Re: Review Request 71858: Set resource limits when launching executor container.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71858/#review219697
-----------------------------------------------------------




src/slave/slave.cpp
Lines 3262 (patched)
<https://reviews.apache.org/r/71858/#comment307914>

    We have validation which ensures that executor CPU and mem are always greater than or equal to the minimum values, so I think this could be a `CHECK()`?
    
    https://github.com/apache/mesos/blob/998aee66bfedd1fe15bb1e1fc43a637fe91662a5/src/master/validation.cpp#L1842-L1859


- Greg Mann


On Feb. 25, 2020, 1:46 a.m., Qian Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71858/
> -----------------------------------------------------------
> 
> (Updated Feb. 25, 2020, 1:46 a.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10046
>     https://issues.apache.org/jira/browse/MESOS-10046
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Set resource limits when launching executor container.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 03279db082bdba23dbfeb2d93081a908e609aec2 
>   src/slave/slave.cpp cce275a504effae7a6b71dd333ce8a300d1ce5be 
> 
> 
> Diff: https://reviews.apache.org/r/71858/diff/10/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>


Re: Review Request 71858: Set resource limits when launching executor container.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71858/#review219775
-----------------------------------------------------------


Ship it!




Ship It!

- Greg Mann


On Feb. 25, 2020, 1:46 a.m., Qian Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71858/
> -----------------------------------------------------------
> 
> (Updated Feb. 25, 2020, 1:46 a.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10046
>     https://issues.apache.org/jira/browse/MESOS-10046
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Set resource limits when launching executor container.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 03279db082bdba23dbfeb2d93081a908e609aec2 
>   src/slave/slave.cpp cce275a504effae7a6b71dd333ce8a300d1ce5be 
> 
> 
> Diff: https://reviews.apache.org/r/71858/diff/10/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>


Re: Review Request 71858: Set resource limits when launching executor container.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71858/#review219881
-----------------------------------------------------------


Fix it, then Ship it!





src/slave/slave.cpp
Lines 191 (patched)
<https://reviews.apache.org/r/71858/#comment308124>

    Could you add a comment here explaining this helper?


- Greg Mann


On March 9, 2020, 2:04 p.m., Qian Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71858/
> -----------------------------------------------------------
> 
> (Updated March 9, 2020, 2:04 p.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10046
>     https://issues.apache.org/jira/browse/MESOS-10046
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Set resource limits when launching executor container.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 03279db082bdba23dbfeb2d93081a908e609aec2 
>   src/slave/slave.cpp a914de4bae5630126ae26b9e7dee8c8784ce1ee0 
> 
> 
> Diff: https://reviews.apache.org/r/71858/diff/11/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>


Re: Review Request 71858: Set resource limits when launching executor container.

Posted by Qian Zhang <zh...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71858/
-----------------------------------------------------------

(Updated March 16, 2020, 5:04 p.m.)


Review request for mesos, Andrei Budnik and Greg Mann.


Changes
-------

Addressed review comments.


Bugs: MESOS-10046
    https://issues.apache.org/jira/browse/MESOS-10046


Repository: mesos


Description
-------

Set resource limits when launching executor container.


Diffs (updated)
-----

  src/slave/slave.hpp 03279db082bdba23dbfeb2d93081a908e609aec2 
  src/slave/slave.cpp a914de4bae5630126ae26b9e7dee8c8784ce1ee0 


Diff: https://reviews.apache.org/r/71858/diff/12/

Changes: https://reviews.apache.org/r/71858/diff/11-12/


Testing
-------


Thanks,

Qian Zhang


Re: Review Request 71858: Set resource limits when launching executor container.

Posted by Qian Zhang <zh...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71858/
-----------------------------------------------------------

(Updated March 9, 2020, 10:04 p.m.)


Review request for mesos, Andrei Budnik and Greg Mann.


Changes
-------

Refactored the code to compute executor resource limits into `Slave::computeExecutorLimits()`.


Bugs: MESOS-10046
    https://issues.apache.org/jira/browse/MESOS-10046


Repository: mesos


Description
-------

Set resource limits when launching executor container.


Diffs (updated)
-----

  src/slave/slave.hpp 03279db082bdba23dbfeb2d93081a908e609aec2 
  src/slave/slave.cpp a914de4bae5630126ae26b9e7dee8c8784ce1ee0 


Diff: https://reviews.apache.org/r/71858/diff/11/

Changes: https://reviews.apache.org/r/71858/diff/10-11/


Testing
-------


Thanks,

Qian Zhang


Re: Review Request 71858: Set resource limits when launching executor container.

Posted by Qian Zhang <zh...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71858/
-----------------------------------------------------------

(Updated Feb. 25, 2020, 9:46 a.m.)


Review request for mesos, Andrei Budnik and Greg Mann.


Changes
-------

Minor change.


Bugs: MESOS-10046
    https://issues.apache.org/jira/browse/MESOS-10046


Repository: mesos


Description
-------

Set resource limits when launching executor container.


Diffs (updated)
-----

  src/slave/slave.hpp 03279db082bdba23dbfeb2d93081a908e609aec2 
  src/slave/slave.cpp cce275a504effae7a6b71dd333ce8a300d1ce5be 


Diff: https://reviews.apache.org/r/71858/diff/10/

Changes: https://reviews.apache.org/r/71858/diff/9-10/


Testing
-------


Thanks,

Qian Zhang