Posted to yarn-issues@hadoop.apache.org by "Zhankun Tang (JIRA)" <ji...@apache.org> on 2018/10/21 04:44:00 UTC

[jira] [Comment Edited] (YARN-6729) NM percentage-physical-cpu-limit should be always 100 if DefaultLCEResourcesHandler is used

    [ https://issues.apache.org/jira/browse/YARN-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16658068#comment-16658068 ] 

Zhankun Tang edited comment on YARN-6729 at 10/21/18 4:43 AM:
--------------------------------------------------------------

[~yufeigu], thanks for reporting this. This is a legacy configuration/code issue in CPU resource isolation. In YARN-3542, we introduced _CGroupsCpuResourceHandlerImpl_, based on the new _ResourceHandler_ mechanism, but left the old configuration "yarn.nodemanager.linux-container-executor.resources-handler.class" in place for a long time. It now causes confusion.

Internally, _CgroupsLCEResourcesHandler_ and _DefaultLCEResourcesHandler_ are both deprecated, and YARN no longer uses them.

Instead, YARN uses _CGroupsCpuResourceHandlerImpl_ for CPU isolation, and only with the LinuxContainerExecutor (LCE). To enforce CPU usage limits, we must configure LCE and _CgroupsLCEResourcesHandler_ like this:
{noformat}
<property>
  <description>Who will execute (launch) the containers.</description>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <description>The class which should help the LCE handle resources.</description>
  <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
</property>{noformat}
Only with the above settings does "percentage-physical-cpu-limit" work as expected.
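
For illustration, the limit itself could then be set alongside the two properties above. This is only a sketch: it assumes the full property key is yarn.nodemanager.resource.percentage-physical-cpu-limit as listed in yarn-default.xml, and the value 80 is an arbitrary example, not a recommendation:
{noformat}
<property>
  <description>Percentage of physical CPU that YARN containers may use (example value).</description>
  <name>yarn.nodemanager.resource.percentage-physical-cpu-limit</name>
  <value>80</value>
</property>{noformat}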

Back to your question: because the _DefaultLCEResourcesHandler_ setting no longer enables LCE or CPU isolation, values under 100 for related settings such as "percentage-physical-cpu-limit" (default 100) have no effect.

 I think two steps would make this easier for end users:
 # Refine the documentation to describe the dependency between the above settings.
 # Throw an exception when users set a legacy configuration such as _DefaultLCEResourcesHandler_.

Comments? 



> NM percentage-physical-cpu-limit should be always 100 if DefaultLCEResourcesHandler is used
> -------------------------------------------------------------------------------------------
>
>                 Key: YARN-6729
>                 URL: https://issues.apache.org/jira/browse/YARN-6729
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.0.0-alpha3
>            Reporter: Yufei Gu
>            Priority: Major
>
> NM percentage-physical-cpu-limit is not honored in DefaultLCEResourcesHandler, which may cause a container CPU usage calculation issue, e.g. container vcore usage can potentially exceed 100% if percentage-physical-cpu-limit is set to a value less than 100.


