Posted to mapreduce-user@hadoop.apache.org by John Lilley <jo...@redpoint.net> on 2013/07/02 17:56:36 UTC

Containers and CPU

I have YARN tasks that benefit from multicore scaling.  However, they don't *always* use more than one core.  I would like to allocate containers based only on memory, and let each task use as many cores as needed, without allocating exclusive CPU "slots" in the scheduler.  For example, on an 8-core node with 16GB memory, I'd like to be able to run 3 tasks each consuming 4GB memory and each using as much CPU as they like.  Is this the default behavior if I don't specify CPU restrictions to the scheduler?
Thanks
John
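A minimal sketch of what a memory-only request looks like from an ApplicationMaster, assuming the Hadoop 2.x AMRMClient API (registration and the allocate loop are omitted); the 4 GB figure matches the example above, and vcores is simply left at 1:

    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class MemoryOnlyRequest {
        public static void main(String[] args) {
            AMRMClient<ContainerRequest> amClient = AMRMClient.createAMRMClient();
            amClient.init(new YarnConfiguration());
            amClient.start();

            // 4096 MB of memory; vcores left at 1, i.e. no real CPU demand expressed.
            Resource capability = Resource.newInstance(4096, 1);

            // No node or rack constraints; priority 0.
            ContainerRequest request =
                new ContainerRequest(capability, null, null, Priority.newInstance(0));
            amClient.addContainerRequest(request);
        }
    }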



RE: Containers and CPU

Posted by John Lilley <jo...@redpoint.net>.
To explain my reasoning, suppose that I have an application that performs some CPU-intensive calculation, and can scale to multiple cores internally, but it doesn't need those cores all the time because the CPU-intensive phase is only a part of the overall computation.  I'm not sure I understand cgroups' CPU control - does it statically mask cores available to processes, or does it set up a prioritization for access to all available cores?
Thanks,
John




Re: Containers and CPU

Posted by Sandy Ryza <sa...@cloudera.com>.
That's correct.

-Sandy


On Tue, Jul 2, 2013 at 12:28 PM, John Lilley <jo...@redpoint.net> wrote:

> Sandy,
>
> Thanks, I think I understand.  So it only makes a difference if cgroups is
> on AND the AM requests multiple cores?  E.g. if each task wants 4 cores the
> RM would only allow two containers per 8-core node?
>
> John


RE: Containers and CPU

Posted by John Lilley <jo...@redpoint.net>.
Sandy,
Thanks, I think I understand.  So it only makes a difference if cgroups is on AND the AM requests multiple cores?  E.g. if each task wants 4 cores the RM would only allow two containers per 8-core node?
John
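As back-of-the-envelope arithmetic, an illustration only rather than the scheduler's actual code, and assuming a CPU-aware calculator such as DominantResourceCalculator is in place:

    public class PackingSketch {
        public static void main(String[] args) {
            // Node: 16 GB and 8 vcores.  Each task asks for 4 GB and 4 vcores.
            int nodeMemMb = 16 * 1024, nodeVcores = 8;
            int taskMemMb = 4 * 1024, taskVcores = 4;

            int byMemory = nodeMemMb / taskMemMb;   // 4 containers fit by memory
            int byCpu = nodeVcores / taskVcores;    // 2 containers fit by vcores

            // CPU-aware accounting takes the tighter of the two limits.
            System.out.println("CPU-aware:   " + Math.min(byMemory, byCpu)); // 2
            // The default, memory-only accounting ignores vcores entirely.
            System.out.println("Memory-only: " + byMemory);                  // 4
        }
    }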




Re: Containers and CPU

Posted by Sandy Ryza <sa...@cloudera.com>.
Use of cgroups for controlling CPU is off by default, but can be turned on
as a nodemanager configuration with
yarn.nodemanager.linux-container-executor.resources-handler.class.  So it
is site-wide.  If you want tasks to purely fight it out in the OS thread
scheduler, simply don't change from the default.
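A small sketch of the node-level settings involved, read through YarnConfiguration just to spell out the names; in a real cluster they are set in the NodeManager's yarn-site.xml, and the handler and executor class names in the comments are assumptions to verify against your Hadoop version:

    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class CgroupsSettings {
        public static void main(String[] args) {
            YarnConfiguration conf = new YarnConfiguration();

            // Assumed default: DefaultLCEResourcesHandler (no cgroups, CPU unenforced).
            // Switching to CgroupsLCEResourcesHandler enables the cgroups-based CPU
            // enforcement described here.
            System.out.println(conf.get(
                "yarn.nodemanager.linux-container-executor.resources-handler.class"));

            // cgroups also presumes the LinuxContainerExecutor is in use.
            System.out.println(conf.get("yarn.nodemanager.container-executor.class"));
        }
    }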

Even with cgroups on, all tasks will have access to all CPU cores.  We
don't do any pinning of tasks to cores.  If a task is requested with a
single vcore and placed on an otherwise empty machine with 8 cores, it will
have access to all 8 cores.  If 3 other tasks that requested a single vcore
are later placed on the same node, and all tasks are using as much CPU as
they can get their hands on, then each of the tasks will get 2 cores of
CPU-time.
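The same contention scenario as rough arithmetic, an illustration only rather than anything YARN or cgroups actually executes:

    public class ContentionShares {
        public static void main(String[] args) {
            // Four busy containers on an 8-core node, each having asked for 1 vcore:
            // under contention, cgroups weights CPU time by the requested vcores.
            int physicalCores = 8;
            int[] requestedVcores = {1, 1, 1, 1};

            int total = 0;
            for (int v : requestedVcores) total += v;

            for (int v : requestedVcores) {
                double cores = physicalCores * (double) v / total;   // 2.0 each
                System.out.printf("asked for %d vcore(s) -> ~%.1f cores%n", v, cores);
            }
        }
    }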




RE: Containers and CPU

Posted by John Lilley <jo...@redpoint.net>.
Sandy,
Sorry, I don't completely follow.
When you say "with cgroups on", is that an attribute of the AM, the Scheduler, or the Site/RM?  In other words is it site-wide or something that my application can control?
With cgroups on, is there still a way to get my desired behavior?  I'd really like all tasks to have access to all CPU cores and simply fight it out in the OS thread scheduler.
Thanks,
john



Re: Containers and CPU

Posted by Sandy Ryza <sa...@cloudera.com>.
CPU limits are only enforced if cgroups is turned on.  With cgroups on,
they are only limited when there is contention, in which case tasks are
given CPU time in proportion to the number of cores requested for/allocated
to them.  Does that make sense?

-Sandy


On Tue, Jul 2, 2013 at 9:50 AM, Chuan Liu <ch...@microsoft.com> wrote:

>  I believe this is the default behavior.
>
> By default, only memory limit on resources is enforced.
>
> The capacity scheduler will use DefaultResourceCalculator to compute
> resource allocation for containers by default, which also does not take CPU
> into account.
>
> -Chuan
>
> *From:* John Lilley [mailto:john.lilley@redpoint.net]
> *Sent:* Tuesday, July 02, 2013 8:57 AM
> *To:* user@hadoop.apache.org
> *Subject:* Containers and CPU
>
> I have YARN tasks that benefit from multicore scaling.  However, they
> don’t *always* use more than one core.  I would like to allocate
> containers based only on memory, and let each task use as many cores as
> needed, without allocating exclusive CPU “slots” in the scheduler.  For
> example, on an 8-core node with 16GB memory, I’d like to be able to run 3
> tasks each consuming 4GB memory and each using as much CPU as they like.
> Is this the default behavior if I don’t specify CPU restrictions to the
> scheduler?
>
> Thanks
>
> John
>
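
The proportionality described here comes from cpu.shares in the cgroups CPU controller: each container's share is set in proportion to the vcores requested for it, and the shares only take effect when the node's CPU is saturated; otherwise containers are free to use idle cores. The sketch below is a toy model of the fully contended case, meant only to illustrate the arithmetic; it is not YARN code, and the node size and container mix are made-up examples.

    // Toy model of cgroups-style proportional CPU sharing under full contention.
    // Not YARN code: it only illustrates "CPU time in proportion to requested
    // vcores" for a hypothetical 8-core node.
    public class ProportionalShareModel {
      public static void main(String[] args) {
        int nodeCores = 8;
        int[] requestedVcores = {1, 1, 4};  // three hypothetical containers

        int total = 0;
        for (int v : requestedVcores) {
          total += v;
        }

        // Under saturation, container i gets roughly vcores_i / total of the CPU.
        // When the node is not saturated, a container may exceed this and use
        // any idle cores, because cpu.shares is work-conserving.
        for (int i = 0; i < requestedVcores.length; i++) {
          double coresUnderContention =
              nodeCores * (double) requestedVcores[i] / total;
          System.out.printf("container %d: ~%.1f cores when fully contended%n",
              i, coresUnderContention);
        }
      }
    }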
