Posted to dev@myriad.apache.org by Darin Johnson <db...@gmail.com> on 2016/05/04 17:30:13 UTC

cgroups suggestions

I've been digging into cgroups support. A few things are easy fixes, but a
few become problematic, so I'd like to discuss them.

First, the code dictates certain options that could instead be placed in
yarn-site.xml. Moving them there would remove code and provide
flexibility.  That's easy.

The second involves the cgroup hierarchy and the cgroup mount point.  Here
the code attempts to create a hierarchy in $CGROUP_DIR/mesos/$TASK_ID.
This is problematic because mesos will not unmount the hierarchy when the
task (in this case the node manager) finishes, so it is also unable to
unmount its own task hierarchy.  (This also creates the need to chmod a
number of directories as a superuser.)  This leads to issues.  An
alternative approach would be to use the container-executor program
(already suid w/ yarn's group) to create the hierarchy as
$CGROUP_DIR/frameworkname if it doesn't exist.  This may open another can of
worms, as I haven't tested it fully.

Any thoughts or suggestions would be appreciated.

Darin

Re: cgroups suggestions

Posted by Darin Johnson <db...@gmail.com>.
It turns out everything works if you set the permissions of
$CGROUP_ROOT/mesos/$TASKID/ appropriately, so that the yarn user can write
to the hierarchy.  Then all works exactly as expected.
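
A minimal sketch of the kind of permission change described above, in
Python. It is illustrative only and is not Myriad's code: the function name
grant_group_write is hypothetical, and it assumes the yarn user belongs to
the group that owns the hierarchy (otherwise ownership would have to be
changed instead, as container-executor does).

```python
import os
import stat


def grant_group_write(root):
    """Add group read/write (plus search on directories) below root.

    Assumption: the yarn user is a member of the group that owns the
    hierarchy. Returns the list of paths whose mode was changed.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Directories also need the group execute bit to be traversable.
        entries = [(dirpath, stat.S_IRGRP | stat.S_IWGRP | stat.S_IXGRP)]
        entries += [
            (os.path.join(dirpath, name), stat.S_IRGRP | stat.S_IWGRP)
            for name in filenames
        ]
        for path, extra_bits in entries:
            mode = stat.S_IMODE(os.stat(path).st_mode)
            if mode | extra_bits != mode:
                os.chmod(path, mode | extra_bits)
                changed.append(path)
    return changed
```

It would be called on the task directory, for example
grant_group_write(os.path.join(cgroup_root, "mesos", task_id)), by a user
that is allowed to modify that directory.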

I spent a while running through the container-executor code.  When it
mounts a cgroup subsystem, it changes the ownership of the hierarchy to the
yarn user.  The original cgroups code of myriad attempted to do something
similar by chmoding the directory, but assumed the yarn user would be a
member of group root.  Also, when the code was written the chmod happened
as root.  Currently that is ineffective, as the standard framework user does
not necessarily have permission to modify $CGROUP_ROOT/mesos/$TASKID.
However, we have a mechanism for using a frameworksuperuser which can do
this (my current hack).

The current code also sets
yarn.nodemanager.linux-container-executor.cgroups.mount-path=/sys/fs/cgroup
and yarn.nodemanager.linux-container-executor.cgroups.mount=true; the
documentation then requires edits to yarn-site.xml to get these passed
through.
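
For reference, the two settings named above would look roughly like this as
yarn-site.xml entries. This is only a sketch of the property syntax using
the values quoted above; how Myriad passes them through to the node manager
is not shown.

```xml
<!-- yarn-site.xml fragment: cgroup mount settings quoted in this thread -->
<property>
  <name>yarn.nodemanager.linux-container-executor.cgroups.mount-path</name>
  <value>/sys/fs/cgroup</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.cgroups.mount</name>
  <value>true</value>
</property>
```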

Now that I've got things working, I'll start cleaning up the original code
to provide a cleaner setup and adjust the documentation as necessary.  I
should have a PR soon.

Re: cgroups suggestions

Posted by Darin Johnson <db...@gmail.com>.
Santosh, that is the behavior I'm seeing.
On May 4, 2016 6:13 PM, "Santosh Marella" <sm...@maprtech.com> wrote:

> > The second involves the cgroup hierarchy and the cgroup mount point. Here
> > the code attempts to create a hierarchy in $CGROUP_DIR/mesos/$TASK_ID.
> > This is problematic as mesos will not unmount the hierarchy when the task
> > finished (in this case the node manager)
>
> IIRC, when a task is launched by mesos, the agent creates
> $CGROUP_DIR/mesos/$TASK_ID mount point to enforce cpu/mem for that task.
> Once the task finishes, the agent should unmount the $TASK_ID. Are you
> saying that's not happening for NMs?
>
> Santosh

Re: cgroups suggestions

Posted by Santosh Marella <sm...@maprtech.com>.
> The second involves the cgroup hierarchy and the cgroup mount point.  Here
> the code attempts to create a hierarchy in $CGROUP_DIR/mesos/$TASK_ID.
> This is problematic as mesos will not unmount the hierarchy when the task
> finished (in this case the node manager)

IIRC, when a task is launched by mesos, the agent creates the
$CGROUP_DIR/mesos/$TASK_ID mount point to enforce cpu/mem for that task.
Once the task finishes, the agent should unmount the $TASK_ID. Are you
saying that's not happening for NMs?

Santosh
