Posted to yarn-issues@hadoop.apache.org by "yewei.huang (JIRA)" <ji...@apache.org> on 2018/07/12 03:57:00 UTC

[jira] [Comment Edited] (YARN-7064) Use cgroup to get container resource utilization

    [ https://issues.apache.org/jira/browse/YARN-7064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541092#comment-16541092 ] 

yewei.huang edited comment on YARN-7064 at 7/12/18 3:56 AM:
------------------------------------------------------------

Thanks [~miklos.szegedi@cloudera.com] for such a nice feature!

As a newcomer to YARN, I'm a little confused about why we add up the user and system times from cpuacct.stat rather than reading cpuacct.usage when computing total CPU usage.

From the [kernel doc|https://www.kernel.org/doc/Documentation/cgroup-v1/cpuacct.txt]:

cpuacct.usage gives the CPU time (in nanoseconds) obtained by this group which is essentially the CPU time obtained by all the tasks in the system.

cpuacct.stat lists a few statistics which further divide the CPU time (in USER_HZ units) obtained by the cgroup into user and system times.

It also mentions that the cpuacct controller uses the percpu_counter interface to collect user and system times. This has two side effects:
 * It is theoretically possible to see wrong values for user and system times. This is because percpu_counter_read() on 32bit systems isn't safe against concurrent writes.
 * It is possible to see slightly outdated values for user and system times due to the batch processing nature of percpu_counter.

 

Given these caveats, wouldn't it be safer to use cpuacct.usage?
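
To make the unit difference concrete, here is a minimal sketch (not the NodeManager code) that converts both readings to nanoseconds so they can be compared. The function names and sample values are hypothetical, and USER_HZ = 100 is an assumption that should be checked on the target host.

```python
# Sketch only: compares the two cpuacct readings in a common unit.
# Assumption: USER_HZ is 100, a common Linux value; on a real host it
# can be queried with os.sysconf("SC_CLK_TCK").
USER_HZ = 100
NANOS_PER_SECOND = 1_000_000_000


def parse_cpuacct_stat(text):
    """Return (user, system) tick counts from the content of cpuacct.stat."""
    values = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            values[parts[0]] = int(parts[1])
    return values["user"], values["system"]


def stat_to_nanos(text, user_hz=USER_HZ):
    """Total CPU time in nanoseconds derived from user + system ticks."""
    user, system = parse_cpuacct_stat(text)
    return (user + system) * NANOS_PER_SECOND // user_hz


def usage_to_nanos(text):
    """Total CPU time in nanoseconds as reported by cpuacct.usage."""
    return int(text.strip())


if __name__ == "__main__":
    # Made-up sample file contents, for illustration only.
    stat_text = "user 1234\nsystem 567\n"
    usage_text = "18013456789\n"
    print(stat_to_nanos(stat_text))    # 18010000000
    print(usage_to_nanos(usage_text))  # 18013456789
```

With these sample values the two totals differ by less than one tick (10 ms at USER_HZ = 100). The sum from cpuacct.stat can only be as fine-grained as a tick, while cpuacct.usage is already in nanoseconds and needs no conversion.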

 



> Use cgroup to get container resource utilization
> ------------------------------------------------
>
>                 Key: YARN-7064
>                 URL: https://issues.apache.org/jira/browse/YARN-7064
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>            Reporter: Miklos Szegedi
>            Assignee: Miklos Szegedi
>            Priority: Major
>             Fix For: 3.1.0
>
>         Attachments: YARN-7064.000.patch, YARN-7064.001.patch, YARN-7064.002.patch, YARN-7064.003.patch, YARN-7064.004.patch, YARN-7064.005.patch, YARN-7064.007.patch, YARN-7064.008.patch, YARN-7064.009.patch, YARN-7064.010.patch, YARN-7064.011.patch, YARN-7064.012.patch, YARN-7064.013.patch, YARN-7064.014.patch
>
>
> This is an addendum to YARN-6668. The problem is that the YARN-6668 jira always wants to rebase patches against YARN-1011 instead of trunk.


