You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mesos.apache.org by James Peach <jo...@gmail.com> on 2018/06/27 06:42:41 UTC
Re: mesos git commit: Made `gpu/nvidia` isolator works with
`cgroups/all` isolation option.
Do we still need this check? The order of built-in isolators is now fixed, so do we still need to verify this ordering?
> On Jun 27, 2018, at 4:17 PM, gilbert@apache.org wrote:
>
> Repository: mesos
> Updated Branches:
> refs/heads/master 2e913d545 -> b581136bd
>
>
> Made `gpu/nvidia` isolator works with `cgroups/all` isolation option.
>
> Review: https://reviews.apache.org/r/67743/
>
>
> Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
> Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/b581136b
> Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/b581136b
> Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/b581136b
>
> Branch: refs/heads/master
> Commit: b581136bd5d28ea74e410c7a57ac10d05c334b5b
> Parents: 2e913d5
> Author: Qian Zhang <zh...@gmail.com>
> Authored: Tue Jun 26 23:16:36 2018 -0700
> Committer: Gilbert Song <so...@gmail.com>
> Committed: Tue Jun 26 23:16:36 2018 -0700
>
> ----------------------------------------------------------------------
> .../mesos/isolators/gpu/isolator.cpp | 33 +++++++++++++++++---
> 1 file changed, 29 insertions(+), 4 deletions(-)
> ----------------------------------------------------------------------
>
>
> http://git-wip-us.apache.org/repos/asf/mesos/blob/b581136b/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> ----------------------------------------------------------------------
> diff --git a/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp b/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> index d79c940..5066882 100644
> --- a/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> +++ b/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> @@ -102,20 +102,45 @@ Try<Isolator*> NvidiaGpuIsolatorProcess::create(
> const Flags& flags,
> const NvidiaComponents& components)
> {
> - // Make sure both the 'cgroups/devices' isolator and the
> - // 'filesystem/linux' isolators are present and precede the GPU
> - // isolator.
> + // Make sure both the 'cgroups/devices' (or 'cgroups/all') isolator and the
> + // 'filesystem/linux' isolators are present and precede the GPU isolator.
> vector<string> tokens = strings::tokenize(flags.isolation, ",");
>
> auto gpuIsolator =
> std::find(tokens.begin(), tokens.end(), "gpu/nvidia");
> +
> auto devicesIsolator =
> std::find(tokens.begin(), tokens.end(), "cgroups/devices");
> +
> + auto cgroupsAllIsolator =
> + std::find(tokens.begin(), tokens.end(), "cgroups/all");
> +
> auto filesystemIsolator =
> std::find(tokens.begin(), tokens.end(), "filesystem/linux");
>
> CHECK(gpuIsolator != tokens.end());
>
> + if (cgroupsAllIsolator != tokens.end()) {
> + // The reason that we need to check if `devices` cgroups subsystem is
> + // enabled is, when `cgroups/all` is specified in the `--isolation` agent
> + // flag, cgroups isolator will only load the enabled subsystems. So if
> + // `cgroups/all` is specified but `devices` is not enabled, cgroups isolator
> + // will not load `devices` subsystem in which case we should error out.
> + Try<bool> result = cgroups::enabled("devices");
> + if (result.isError()) {
> + return Error(
> + "Failed to check if the `devices` cgroups subsystem"
> + " is enabled by kernel: " + result.error());
> + } else if (!result.get()) {
> + return Error(
> + "The `devices` cgroups subsystem is not enabled by the kernel");
> + }
> +
> + if (devicesIsolator > cgroupsAllIsolator) {
> + devicesIsolator = cgroupsAllIsolator;
> + }
> + }
> +
> if (devicesIsolator == tokens.end()) {
> return Error("The 'cgroups/devices' isolator must be enabled in"
> " order to use the 'gpu/nvidia' isolator");
> @@ -127,7 +152,7 @@ Try<Isolator*> NvidiaGpuIsolatorProcess::create(
> }
>
> if (devicesIsolator > gpuIsolator) {
> - return Error("'cgroups/devices' must precede 'gpu/nvidia'"
> + return Error("'cgroups/devices' or 'cgroups/all' must precede 'gpu/nvidia'"
> " in the --isolation flag");
> }
>
>
Re: mesos git commit: Made `gpu/nvidia` isolator works with
`cgroups/all` isolation option.
Posted by Qian Zhang <zh...@gmail.com>.
Thanks James for reminding this!
You are right, we do not need this check, I have posted a patch (
https://reviews.apache.org/r/67767/) to remove it, can you help review it?
Regards,
Qian Zhang
On Wed, Jun 27, 2018 at 2:42 PM, James Peach <jo...@gmail.com> wrote:
> Do we still need this check? The order of built-in isolators is now fixed,
> so do we still need to verify this ordering?
>
> > On Jun 27, 2018, at 4:17 PM, gilbert@apache.org wrote:
> >
> > Repository: mesos
> > Updated Branches:
> > refs/heads/master 2e913d545 -> b581136bd
> >
> >
> > Made `gpu/nvidia` isolator works with `cgroups/all` isolation option.
> >
> > Review: https://reviews.apache.org/r/67743/
> >
> >
> > Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
> > Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/b581136b
> > Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/b581136b
> > Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/b581136b
> >
> > Branch: refs/heads/master
> > Commit: b581136bd5d28ea74e410c7a57ac10d05c334b5b
> > Parents: 2e913d5
> > Author: Qian Zhang <zh...@gmail.com>
> > Authored: Tue Jun 26 23:16:36 2018 -0700
> > Committer: Gilbert Song <so...@gmail.com>
> > Committed: Tue Jun 26 23:16:36 2018 -0700
> >
> > ----------------------------------------------------------------------
> > .../mesos/isolators/gpu/isolator.cpp | 33
> +++++++++++++++++---
> > 1 file changed, 29 insertions(+), 4 deletions(-)
> > ----------------------------------------------------------------------
> >
> >
> > http://git-wip-us.apache.org/repos/asf/mesos/blob/b581136b/
> src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> > ----------------------------------------------------------------------
> > diff --git a/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> b/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> > index d79c940..5066882 100644
> > --- a/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> > +++ b/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> > @@ -102,20 +102,45 @@ Try<Isolator*> NvidiaGpuIsolatorProcess::create(
> > const Flags& flags,
> > const NvidiaComponents& components)
> > {
> > - // Make sure both the 'cgroups/devices' isolator and the
> > - // 'filesystem/linux' isolators are present and precede the GPU
> > - // isolator.
> > + // Make sure both the 'cgroups/devices' (or 'cgroups/all') isolator
> and the
> > + // 'filesystem/linux' isolators are present and precede the GPU
> isolator.
> > vector<string> tokens = strings::tokenize(flags.isolation, ",");
> >
> > auto gpuIsolator =
> > std::find(tokens.begin(), tokens.end(), "gpu/nvidia");
> > +
> > auto devicesIsolator =
> > std::find(tokens.begin(), tokens.end(), "cgroups/devices");
> > +
> > + auto cgroupsAllIsolator =
> > + std::find(tokens.begin(), tokens.end(), "cgroups/all");
> > +
> > auto filesystemIsolator =
> > std::find(tokens.begin(), tokens.end(), "filesystem/linux");
> >
> > CHECK(gpuIsolator != tokens.end());
> >
> > + if (cgroupsAllIsolator != tokens.end()) {
> > + // The reason that we need to check if `devices` cgroups subsystem
> is
> > + // enabled is, when `cgroups/all` is specified in the `--isolation`
> agent
> > + // flag, cgroups isolator will only load the enabled subsystems. So
> if
> > + // `cgroups/all` is specified but `devices` is not enabled, cgroups
> isolator
> > + // will not load `devices` subsystem in which case we should error
> out.
> > + Try<bool> result = cgroups::enabled("devices");
> > + if (result.isError()) {
> > + return Error(
> > + "Failed to check if the `devices` cgroups subsystem"
> > + " is enabled by kernel: " + result.error());
> > + } else if (!result.get()) {
> > + return Error(
> > + "The `devices` cgroups subsystem is not enabled by the
> kernel");
> > + }
> > +
> > + if (devicesIsolator > cgroupsAllIsolator) {
> > + devicesIsolator = cgroupsAllIsolator;
> > + }
> > + }
> > +
> > if (devicesIsolator == tokens.end()) {
> > return Error("The 'cgroups/devices' isolator must be enabled in"
> > " order to use the 'gpu/nvidia' isolator");
> > @@ -127,7 +152,7 @@ Try<Isolator*> NvidiaGpuIsolatorProcess::create(
> > }
> >
> > if (devicesIsolator > gpuIsolator) {
> > - return Error("'cgroups/devices' must precede 'gpu/nvidia'"
> > + return Error("'cgroups/devices' or 'cgroups/all' must precede
> 'gpu/nvidia'"
> > " in the --isolation flag");
> > }
> >
> >
>
>