You are viewing a plain text version of this content. The canonical link for it is here.
Posted to yarn-issues@hadoop.apache.org by "Alejandro Abdelnur (JIRA)" <ji...@apache.org> on 2012/10/06 01:12:02 UTC

[jira] [Created] (YARN-147) Add support for CPU isolation/monitoring of containers

Alejandro Abdelnur created YARN-147:
---------------------------------------

             Summary: Add support for CPU isolation/monitoring of containers
                 Key: YARN-147
                 URL: https://issues.apache.org/jira/browse/YARN-147
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager
    Affects Versions: 2.0.3-alpha
            Reporter: Alejandro Abdelnur
            Assignee: Andrew Ferguson
             Fix For: 2.0.3-alpha
         Attachments: YARN-3.patch

This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated YARN-147:
------------------------------------

    Attachment: YARN-147-feedback.patch

patch with comments on top of original patch, all comments start with //TUCU: (for easy search).
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-feedback.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Andrew Ferguson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ferguson updated YARN-147:
---------------------------------

    Attachment: YARN-147-v3.patch

update native code per review by Colin
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-147-v3.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477535#comment-13477535 ] 

Hadoop QA commented on YARN-147:
--------------------------------

{color:green}+1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12549422/YARN-147-v2.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 2 new or modified test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/93//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/93//console

This message is automatically generated.
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Andrew Ferguson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ferguson updated YARN-147:
---------------------------------

    Attachment: YARN-147-v2.patch

thanks for the additional comments, Tucu!  I've updated the patch as per your review. hopefully I have done everything correctly.

btw, I have this patch in github:
https://github.com/adferguson/hadoop-common/tree/adf-yarn-147
you can see the changes for this patch in the most recent commit.


thanks!
Andrew
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Andrew Ferguson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ferguson updated YARN-147:
---------------------------------

    Attachment: YARN-147-v1.patch

updated patch as per Tucu's review
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Comment Edited] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470926#comment-13470926 ] 

Alejandro Abdelnur edited comment on YARN-147 at 10/6/12 8:14 PM:
------------------------------------------------------------------


*CgroupsLCEResourcesHandler*

a few lines exceed 80 chars, reformat

*CgroupsLCEResourcesHandler.setConf()*

move these assignments to init()
cgroupMount, default should be true, so the mounts are created if they don't exist
 cgroupMountPath, if there is no default we should fail if not set, can't we have a sensible default?
 
*CgroupsLCEResourcesHandler.pathForCgroup()*

default value for cgroupPrefix has '/', here will produce a '//' in the path

*CgroupsLCEResourcesHandler.updateCgroup()*

LOG.debug should be within an IF LOG.DebugEnabled BLOCK to avoid concatenation of the message if not logging.

The  writer close should be done in a finally after checking f != null. this is generating the findbugs warning

*CgroupsLCEResourcesHandler.parseMtab()*

Why not move the filereader constructor to the try block below

Nf the filereader cannot be open/read, is this acceptable or should stop execution by throwing exception?

No need to close fREader, closing in takes care of.

*container-executor.c mount_cgroup()*

Use an ELSE block to avoid return in the middle of the function.


                
      was (Author: tucu00):
    patch with comments on top of original patch, all comments start with //TUCU: (for easy search).
                  
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated YARN-147:
------------------------------------

    Attachment: YARN-3.patch

Attaching last patch from YARN-3 from Andrew, had to tweak a few imports in LCE java that were not applying.
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480475#comment-13480475 ] 

Alejandro Abdelnur commented on YARN-147:
-----------------------------------------

+1 pending jenkins (kicked it)
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-147-v3.patch, YARN-147-v4.patch, YARN-147-v5.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Colin Patrick McCabe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480467#comment-13480467 ] 

Colin Patrick McCabe commented on YARN-147:
-------------------------------------------

Looks good to me.
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-147-v3.patch, YARN-147-v4.patch, YARN-147-v5.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Colin Patrick McCabe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480319#comment-13480319 ] 

Colin Patrick McCabe commented on YARN-147:
-------------------------------------------

{{get_kv_key}}, {{get_kv_value}}: thanks for implementing the API I suggested.  It looks a lot better.  You might consider using {{strspn}} to simplify the code a bit (it returns a number of bytes, rather than a pointer, like {{index}} does.)  However, this is up to you (it's fine as-is).  It's more traditional to put the doxygen into the header file than the {{.c}} file.  Header files tend to define an interface, so that's where I go to find out how that interface should be used.

{{mount_cgroup}}: it looks like if {{result = -1}}, the call to {{fprintf}} will print out {{errno}}, which has not been set.  You could either set {{errno}} instead of {{result}} in the preceding code, or restructure the code a bit.
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-147-v3.patch, YARN-147-v4.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated YARN-147:
------------------------------------

    Attachment:     (was: YARN-147-feedback.patch)
    
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Andrew Ferguson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483268#comment-13483268 ] 

Andrew Ferguson commented on YARN-147:
--------------------------------------

hi [~acmurthy], I've started posting replies on YARN-3 instead. the LCE bug is fixed and I'll post a new patch after addressing [~vinodkv]'s comments. thanks!
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-147-v3.patch, YARN-147-v4.patch, YARN-147-v5.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Andrew Ferguson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475909#comment-13475909 ] 

Andrew Ferguson commented on YARN-147:
--------------------------------------

hi Tucu,

thanks very much for opening this new jira and reviewing the patch. I've updated a new version which addresses most of your comments.

answers to the questions in your review:
.bq cgroupMountPath, if there is no default we should fail if not set, can't we have a sensible default?

I've added a check to fail if not set. as far as I can tell, there isn't a single default path for cgroups -- some distributions use "/sys/fs/cgroup", some use "/cgroup", others, "/cgroups". I've even seen "/mnt/cgroup" (Debian perhaps?); these also vary across releases of the same distro. :-(

.bq default value for cgroupPrefix has '/', here will produce a '//' in the path

yes, I made that choice deliberately. I wanted to convey that cgroupPrefix can be a path (which is why I kept the '/') and when I use it, I also added a '/' in case the user did not put a '/' at the right place in the prefix. my understanding is that on Unix, '//' in a path is interpreted as '/', no?

.bq Nf the filereader cannot be open/read, is this acceptable or should stop execution by throwing exception?

eh, we could go either way here, but I think it's reasonable to not throw the exception. if the file can't be read, then the map from cgroup controller to path isn't built, and we already have existing checks which skip controllers which can't be found in the path (say, if the file can be read correctly, but the CPU controller isn't mounted).


ok, great. I'm going to mark this as "patch available" and see if the findbugs warning has gone away (I can't seem to get it to run locally).


thanks!!
Andrew

                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470908#comment-13470908 ] 

Hadoop QA commented on YARN-147:
--------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12548069/YARN-3.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 2 new or modified test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:red}-1 findbugs{color}.  The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/73//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/73//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/73//console

This message is automatically generated.
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476268#comment-13476268 ] 

Alejandro Abdelnur commented on YARN-147:
-----------------------------------------

Andrew, thanks for updating the patch. Mostly minor stuff:

* CgroupsLCEResourcesHandler has an unused import for FileNotFoundException
* CgroupsLCEResourcesHandler, on the double '//', on the init() method, trim the '/' is present  at the beginning or end of the cgroupPrefix
* CgroupsLCEResourcesHandler, parseMtab, no need have a local var for the fReader, the FileReader creation could be done directly in the BufferedReader constructor.
* CgroupsLCEResourcesHandler, parseMtab, the last exception should print MTAB_FILE instead the BufferReader instance.
* CgroupsLCEResourcesHandler, on parseMtab error I think we should thrown an exception to halt things. My reasoning is that if I've configured things to use cgroups, I'd expect them to be working as opposed to have a warning message lost in the logs. This would help identify misconfigurations.

                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Siddharth Seth (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482582#comment-13482582 ] 

Siddharth Seth commented on YARN-147:
-------------------------------------

>From a quick look, I'm not sure if the LCE will continue to work without cgroups. ? LAUNCH_CONTAINER seems to want a "key=" as the resources parameter, and the DefaultLCEResourceHandler sends in a "none". 
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-147-v3.patch, YARN-147-v4.patch, YARN-147-v5.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Andrew Ferguson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480437#comment-13480437 ] 

Andrew Ferguson commented on YARN-147:
--------------------------------------

thanks again, Colin.  I'm going to stick with {{strchr}} instead of {{strspn}} because I think it's clearer that we get NULL when there is no = in the input string, rather than having to check if the length of the "key" is the same as the length of the string in that case.

thanks for noticing that the error handling wasn't quite right in {{mount_cgroup}}.

updated patch to follow in a sec.
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-147-v3.patch, YARN-147-v4.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Vinod Kumar Vavilapalli (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482751#comment-13482751 ] 

Vinod Kumar Vavilapalli commented on YARN-147:
----------------------------------------------

Got lost between YARN-3 and YARN-147 :)

Very nice to have patch, I am willing to get it in ASAP given the amount of time it's been around.

Some comments below.
 - yarn.nodemanager.linux-container-executor.cgroups.mount has different defaults in code and in yarn-default.xml
 - If the configs can be done away with (see below), ignore this comment. The descriptions for all the new configs in yarn-default.xml heavily reference code. We should simplify them to not address code and instead make them understandable by users and cross reference other related parameters.
 - {code}    // Based on testing, ApplicationMaster executables don't terminate until
    // a little after the container appears to have finished. Therefore, we
    // wait a short bit for the cgroup to become empty before deleting it.
   {code}
   Can you explain this? Is this sleep necessary. Depending on its importance, we'll need to fix the following Id check, AMs don't always have ID equaling one.
 - container-executor.c: If a mount-point is already mounted, mount gives a EBUSY error, mount_cgroup() will need to be fixed to support remounts (for e.g. on NM restarts). We could unmount cgroup fs on shutdown but that isn't always guaranteed.
 - Please update if you have tested it on a secure setup with LCE enabled with and without cgroups.

The following are already raised by others in some way, but I don't see them fixed in the latest patch. Unless I am missing something:
 - Not sure of the benefit of configurable yarn.nodemanager.linux-container-executor.cgroups.mount-path. Couldn't NM just always mount to a path that it creates and owns? Similar comment for the hierarchy-prefix.
 - CgroupsLCEResourcesHandler is swallowing exceptions and errors in multiple places - updateCgroup() and createCgroup(). In the later, if cgroups are enabled, and we can't create the file, it is a critical error?

One overarching improvement worth pursing immediately, either now or in follow up tickets:
 - Make ResourcesHandler top level. I'd like to merge the ContainersMonitor functionality with this so as to monitor/enforce memory limits also. ContainersMinotor is top-level, we should make ResourcesHandler also top-level so that other platforms don't need to create this type-hierarchy all over again when they wish to implement some or all of this functionality.
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-147-v3.patch, YARN-147-v4.patch, YARN-147-v5.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Andrew Ferguson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479387#comment-13479387 ] 

Andrew Ferguson commented on YARN-147:
--------------------------------------

hi Colin,

thanks for looking at the native code. since the changes were pretty extensive, would you mind taking a careful look again? if it's easier for you, the incremental changes can be seen here:
https://github.com/adferguson/hadoop-common/commits/adf-yarn-147

I hope I've faithfully implemented the new key-value API you suggested -- let me know if that's not the case.

If the mount fails, I let the exception bubble all the way up to stop the NodeManager, as Tucu suggested before about a different error.

The one thing I did not do is change the open / write / close methods to fopen / fprintf / fclose, as the rest of the native code does not use those methods. Which would you prefer to see: adjust my patch to use fopen, etc., or fix my use of open, etc.?

Yes, I totally agree that it would be better if main.c used getopt_long; it definitely smells like another JIRA to me. :-)


thanks!
Andrew
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-147-v3.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Andrew Ferguson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ferguson updated YARN-147:
---------------------------------

    Attachment: YARN-147-v5.patch

updated patch with most recent suggestions from Colin.
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-147-v3.patch, YARN-147-v4.patch, YARN-147-v5.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Andrew Ferguson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ferguson updated YARN-147:
---------------------------------

    Attachment: YARN-147-v8.patch
    
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-147-v3.patch, YARN-147-v4.patch, YARN-147-v5.patch, YARN-147-v6.patch, YARN-147-v8.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Colin Patrick McCabe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478030#comment-13478030 ] 

Colin Patrick McCabe commented on YARN-147:
-------------------------------------------

Thanks for working on this, Andrew!  I took a quick look at the native part of this patch.

{{get_kv_key}}, {{get_kv_value}}: I would recommend that your API look something like this:
{code}
/**
 * If str is a string of the form key=val, find 'key'
 * 
 * @param input    The input string
 * @param out      Where to put the output string.
 * @param out_len  The length of the output buffer.
 *
 * @return         -ENAMETOOLONG if out_len is not long enough;
 *                 -EINVAL if there is no equals sign in the input;
 *                 0 on success
 */
int get_key(const char *input, char *out, size_t out_len);
{code}

This API doesn't require you to modify the input or allocate memory-- two things that make the code very hard to follow.  It gives the caller the choice of how much memory to allocate and how to allocate it (stack, malloc, etc.)

{code}
  char pid_buf[21];
  snprintf(pid_buf, 21, "%d", pid);
{code}

{{snprintf(pid_buf, sizeof(pid_buf), "%d", pid);}} would be considered better style.

More importantly, it would probably be better for you to use {{fopen}} / {{fprintf}} / {{fclose}} here.  If you do want to use the raw {{open}} / {{write}} / {{close}} methods, you have to correctly handle short writes and {{EINTR}}, which you are not doing here.

{{mount_cgroup}}: you really need to check the return code of {{mount}} and do something if it fails.

I would also recommend that you allocate {{hier_path}} on the stack with 
{code}
char hier_path[PATH_MAX]
{code}
or similar.  Using {{malloc}} here is not really worth it.

{code}
stpncpy(buf, hierarchy, strlen(hierarchy));
{code}

This seems to do the same thing here as the more widely available and portable {{strncpy}} function.  (You're not using the pointer which {{stpncpy}} returns.)  Better yet, use {{snprintf(buf, sizeof(buf), "%s", hierarchy);}} which will do bounds checking properly.

{{main.c}}: Just a general comment.  I really recommend using {{getopt_long}} to parse options.  I've seen manual option parsing grow into a huge mess over time as everyone adds "just one more option."  You don't have to do that in this change (although it would be easy enough to do), but please consider it in the future.
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476220#comment-13476220 ] 

Hadoop QA commented on YARN-147:
--------------------------------

{color:green}+1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12549094/YARN-147-v1.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 2 new or modified test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/89//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/89//console

This message is automatically generated.
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480504#comment-13480504 ] 

Hadoop QA commented on YARN-147:
--------------------------------

{color:green}+1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12550094/YARN-147-v5.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 2 new or modified test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/105//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/105//console

This message is automatically generated.
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-147-v3.patch, YARN-147-v4.patch, YARN-147-v5.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Andrew Ferguson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ferguson updated YARN-147:
---------------------------------

    Attachment: YARN-147-v4.patch

small fix in two places: don't log & re-throw the same exception -- construct new exceptions with better context, and use the previous one as the cause.

thanks Tucu for pointing this out!
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-147-v3.patch, YARN-147-v4.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Siddharth Seth (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482670#comment-13482670 ] 

Siddharth Seth commented on YARN-147:
-------------------------------------

Andrew, could you please comment on testing with and without cgroups.

I don't know enough about cgroups to review the functionality. Otherwise, if this has been tested, I'm also +1 on getting this in (after resolving the previous comment), based on the previous reviews. Thanks!

Minor,
- LCEResourcesHandler, some of the function names (preExecute / postExecute) are a little tough to understand, without looking at where they're used. Would be nice to have JavaDocs / and maybe more verbose function names.
This is nice to have, and can be a separate jira.


                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-147-v3.patch, YARN-147-v4.patch, YARN-147-v5.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483255#comment-13483255 ] 

Arun C Murthy commented on YARN-147:
------------------------------------

Cancelling patch while comments are addressed, particularly the one Sid raised - we can't break LCE.

Also, we need to make sure this continues to work on RHEL5/CentOS5 which doesn't have cgroups.

One more thing - can we please do reviews/discussions on YARN-3 to ensure we keep track in one place? Thanks.
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-147-v3.patch, YARN-147-v4.patch, YARN-147-v5.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (YARN-147) Add support for CPU isolation/monitoring of containers

Posted by "Andrew Ferguson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Ferguson updated YARN-147:
---------------------------------

    Attachment: YARN-147-v6.patch

updated as per reviews on comments here and on YARN-3.
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: YARN-147
>                 URL: https://issues.apache.org/jira/browse/YARN-147
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Andrew Ferguson
>             Fix For: 2.0.3-alpha
>
>         Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-147-v3.patch, YARN-147-v4.patch, YARN-147-v5.patch, YARN-147-v6.patch, YARN-3.patch
>
>
> This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira