You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Ben Mahler <be...@gmail.com> on 2015/06/02 00:00:44 UTC

Re: Review Request 34310: Use IDLE scheduling for revocable CPU in cgroups isolator.


> On May 16, 2015, 8:03 p.m., Joris Van Remoortere wrote:
> > Hi Ian,
> > 
> > I'm wondering about the `Note` regarding only setting the scheduling policy to IDLE if the initial resources are revocable. I think this exposes many scenarios where the isolator will seem `enabled` to the operator, but won't actually be isolating correctly.
> > I don't think it is uncommon to launch an executor with non-revocable resources, and then later launch BE tasks with revocable resources. This pattern would always lead to the isolator being in-effective right?
> > 
> > I am not super familiar with the rules around the isolators, but why can we not adjust the policy within the update() call as opposed to in the isolate() function?
> 
> Ian Downes wrote:
>     This is not a limitation of the isolator interface but a complexity of setting the scheduling policy because it's task rather than cgroup based. We'd need to ensure we set it on all tasks and forking children. One approach could be to briefly freeze the cgroup and set the scheduling policy but I'm concerned that if the slave dies during the freeze/set/unfreeze operation then we've frozen a running container. Otherwise, we need some hackery as used in os:killtree() to ensure we get every task.
>     
>     We need to do it in isolate() regardless because that's the initial forked child which will later exec the executor and we may never get a subsequent update(). I'll look at adding it also to the update() call to capture updates to the resources after the executor has launched.
>     
>     I presume we also want to support the reverse update from having some revocable to having no revocable CPU, i.e., SCHED_IDLE -> SCHED_OTHER?

Why does it need to be as complicated as os::killtree? If we rely on cgroups to provide the list of processes, we shouldn't have to do the stopping and tree traversal logic. Just have to iterate on cgroup::processes until everything is set, no? Are you worried about the fork-bomb case where we iterate forever, unless the freezer is eventually used?


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34310/#review84039
-----------------------------------------------------------


On May 29, 2015, 7:57 p.m., Ian Downes wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34310/
> -----------------------------------------------------------
> 
> (Updated May 29, 2015, 7:57 p.m.)
> 
> 
> Review request for mesos, Joris Van Remoortere, Niklas Nielsen, and Vinod Kone.
> 
> 
> Bugs: MESOS-2652
>     https://issues.apache.org/jira/browse/MESOS-2652
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Use IDLE scheduling for revocable CPU in cgroups isolator.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/isolators/cgroups/cpushare.hpp ff4a9dbdb1b655e71bf87dcee8fe62433d396f52 
>   src/slave/containerizer/isolators/cgroups/cpushare.cpp e3b5e2aa8ba06eec82b929144627d044f6b19d9d 
>   src/slave/flags.hpp 944ed794b6dc8a2d3712145a15b6b27cebf26a22 
>   src/slave/flags.cpp 6b7c61ea1a9f3b5d401d2aec935188a1b70c1738 
>   src/tests/isolator_tests.cpp 24c71b7906a92bdc84a38e88d6084ab09e3cf2ab 
> 
> Diff: https://reviews.apache.org/r/34310/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Ian Downes
> 
>