Posted to mapreduce-issues@hadoop.apache.org by "Alejandro Abdelnur (JIRA)" <ji...@apache.org> on 2012/08/02 00:21:04 UTC
[jira] [Commented] (MAPREDUCE-4334) Add support for CPU isolation/monitoring of containers

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426957#comment-13426957 ] 

Alejandro Abdelnur commented on MAPREDUCE-4334:
-----------------------------------------------

I was chatting offline with Arun about this JIRA. His key concern is that it should be possible to use cgroups without installing additional packages or doing extra OS configuration. As the LinuxContainerExecutor already runs as root, we can leverage that to create the cgroup mounts. This means that using cgroups with zero configuration requires the LinuxContainerExecutor. While the LinuxContainerExecutor is typically used in secure clusters, it can also be used in a non-secure cluster, always running as the mapred user (which would be the equivalent of the DefaultContainerExecutor).

Given this, how about the following proposal?

This approach will not depend on the cgexec binary being installed.

* The LinuxContainerExecutor would have 2 new options. 
** --cgroupsinit <PARAM..>: This option will be used for initialization. When invoked with this option, the LCE will create the cgroup mount point and give ownership of it to the yarn user. Then it will complete its execution.
** --cgroup <PARAM>: This option will be used for launching containers. When invoked with this option, the LCE will add the process to the cgroup specified by the parameter.

* The ResourceEnforcer will have the following methods (exactly as in the latest patch):
** init(): called when the RM is initialized.
** preExecute(containerId, Resource): called before launching the container.
** wrapCommand(containerId, command): augments the execution command line before launching.
** postExecute(containerId): called after launching the container.

* A default implementation of the ResourceEnforcer will do NOPs.
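
To make the shape of the interface concrete, here is a minimal Java sketch of the four methods and the NOP default. The method names come from the list above; the parameter and return types (a String container id, a List<String> command line, and the small Resource stand-in class) are assumptions for illustration and not what the patch necessarily uses.

```java
import java.util.List;

// Hypothetical stand-in for the YARN Resource type; only CPU is modeled here.
final class Resource {
    final int cpuCores;

    Resource(int cpuCores) {
        this.cpuCores = cpuCores;
    }
}

// Sketch of the proposed interface. Parameter and return types are assumptions.
interface ResourceEnforcer {
    // Called once at initialization.
    void init();

    // Called before launching the container.
    void preExecute(String containerId, Resource resource);

    // Returns the (possibly augmented) command line used to launch the container.
    List<String> wrapCommand(String containerId, List<String> command);

    // Called after launching the container.
    void postExecute(String containerId);
}

// Default implementation: every hook is a no-op and the command is returned unchanged.
class DefaultResourceEnforcer implements ResourceEnforcer {
    @Override
    public void init() {
    }

    @Override
    public void preExecute(String containerId, Resource resource) {
    }

    @Override
    public List<String> wrapCommand(String containerId, List<String> command) {
        return command;
    }

    @Override
    public void postExecute(String containerId) {
    }
}
```

With this default in place, callers can invoke the hooks unconditionally and get the current behaviour when no enforcement is configured.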

* The CgroupsResourceEnforcer implementation will do the following:
** init(): call LCE --cgroupsinit
** preExecute(containerId, Resource): configure the cgroup with the assigned cpu resources.
** wrapCommand(containerId, command): augments the regular LCE invocation with the --cgroup option.
** postExecute(containerId): any necessary cgroup clean up.
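
The cgroups flow above could be sketched roughly as follows. This is only an illustration under several assumptions: the class name, the use of the container id as the cgroup parameter, the position of --cgroup right after the binary name, and the int standing in for Resource are all hypothetical. The sketch records the commands it would run instead of actually invoking the LCE, and it does not touch the cgroup filesystem.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; in the proposal this would implement ResourceEnforcer.
class CgroupsResourceEnforcerSketch {
    private final String lcePath;          // path to the LCE binary, supplied by the caller
    private final List<String> initParams; // whatever parameters --cgroupsinit ends up taking

    // Commands that would be executed; recorded instead of run, for illustration.
    final List<List<String>> issued = new ArrayList<>();
    // CPU assigned per container; a real version would configure the cgroup here.
    final Map<String, Integer> cpuByContainer = new HashMap<>();

    CgroupsResourceEnforcerSketch(String lcePath, List<String> initParams) {
        this.lcePath = lcePath;
        this.initParams = new ArrayList<>(initParams);
    }

    // init(): call LCE --cgroupsinit with the configured parameters.
    void init() {
        List<String> cmd = new ArrayList<>();
        cmd.add(lcePath);
        cmd.add("--cgroupsinit");
        cmd.addAll(initParams);
        issued.add(cmd);
    }

    // preExecute(): remember the CPU assignment (int is a stand-in for Resource).
    void preExecute(String containerId, int cpuCores) {
        cpuByContainer.put(containerId, cpuCores);
    }

    // wrapCommand(): add "--cgroup <PARAM>" to the regular LCE invocation.
    // Assumption: the container id is used as the cgroup parameter.
    List<String> wrapCommand(String containerId, List<String> command) {
        if (command.isEmpty()) {
            throw new IllegalArgumentException("command must start with the LCE binary");
        }
        List<String> wrapped = new ArrayList<>(command);
        wrapped.add(1, "--cgroup");
        wrapped.add(2, containerId);
        return wrapped;
    }

    // postExecute(): drop per-container state; real cgroup clean up would go here.
    void postExecute(String containerId) {
        cpuByContainer.remove(containerId);
    }
}
```

Keeping the cgroup-specific pieces behind these four hooks means the container launch path itself only needs to call wrapCommand() on the command line it already builds.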
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: MAPREDUCE-4334
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4334
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>            Reporter: Arun C Murthy
>            Assignee: Andrew Ferguson
>         Attachments: MAPREDUCE-4334-executor-v1.patch, MAPREDUCE-4334-executor-v2.patch, MAPREDUCE-4334-executor-v3.patch, MAPREDUCE-4334-executor-v4.patch, MAPREDUCE-4334-pre1.patch, MAPREDUCE-4334-pre2-with_cpu.patch, MAPREDUCE-4334-pre2.patch, MAPREDUCE-4334-pre3-with_cpu.patch, MAPREDUCE-4334-pre3.patch, MAPREDUCE-4334-v1.patch, MAPREDUCE-4334-v2.patch, mapreduce-4334-design-doc-v2.txt, mapreduce-4334-design-doc.txt
>
>
> Once we get in MAPREDUCE-4327, it will be important to actually enforce limits on CPU consumption of containers. 
> Several options spring to mind:
> # taskset (RHEL5+)
> # cgroups (RHEL6+)
