Posted to user@hadoop.apache.org by Kevin <ke...@gmail.com> on 2015/05/22 00:00:43 UTC

Using YARN with native applications

Hello,

I have been using the distributed shell application and Oozie to run native
C++ applications in the cluster. Is YARN able to see the resources these
native applications use? For example, if I use Oozie's shell action, the
NodeManager hosts the mapper container and allocates a certain amount of
memory and vcores (as configured). What happens if my C++ application uses
more memory or vcores than the NodeManager allocated?

I was looking in the Hadoop code and I couldn't find my way to an answer.
It seems, though, that the LinuxContainerExecutor may be the answer to my
question since it uses cgroups.

I'm interested to know how YARN reacts to non-Java applications running
inside of it.

Thanks,
Kevin

Re: Using YARN with native applications

Posted by Kevin <ke...@gmail.com>.
Ah, okay. That makes sense. Thanks for all your help, Varun.

-Kevin

On Wed, May 27, 2015 at 9:53 AM Varun Vasudev <vv...@hortonworks.com>
wrote:

>   For CPU isolation, you have to use Cgroups with the
> LinuxContainerExecutor. We don’t enforce cpu limits with the
> DefaultContainerExecutor.
>
>  -Varun
>
>   From: Kevin
> Reply-To: "user@hadoop.apache.org"
> Date: Wednesday, May 27, 2015 at 7:06 PM
>
> To: "user@hadoop.apache.org"
> Subject: Re: Using YARN with native applications
>
>   Thanks for the tip. In the trunk it looks like the NodeManager's
> monitor thread doesn't care if the process tree's cores overflows the
> container's CPU limit. Is this monitored elsewhere?
>
>  I have my eyes on
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java#L476
>
>
>  On Wed, May 27, 2015 at 9:06 AM Varun Vasudev <vv...@hortonworks.com>
> wrote:
>
>>   You should also look at ProcfsBasedProcessTree if you want to know how
>> exactly the memory usage is being calculated.
>>
>>  -Varun
>>
>>   From: Kevin
>> Reply-To: "user@hadoop.apache.org"
>> Date: Wednesday, May 27, 2015 at 6:22 PM
>>
>> To: "user@hadoop.apache.org"
>> Subject: Re: Using YARN with native applications
>>
>>   Varun, thank you for helping me understand this. You pointed out a
>> couple of new things to me. I finally found that monitoring thread in the
>> code (ContainersMonitorImpl.java). I can now see and gain a better
>> understanding of YARN checks on a container's resources.
>>
>>  On Wed, May 27, 2015 at 1:23 AM Varun Vasudev <vv...@hortonworks.com>
>> wrote:
>>
>>>  YARN should kill the container. I’m not sure what JVM you’re referring
>>> to, but the NodeManager writes and then spawns a shell script that will
>>> invoke your shell script which in turn (presumably) will invoke your C++
>>> application. A monitoring thread then looks at the memory usage of the
>>> process tree and compares it to the limits for the container.
>>>
>>>  -Varun
>>>
>>>   From: Kevin
>>> Reply-To: "user@hadoop.apache.org"
>>> Date: Tuesday, May 26, 2015 at 7:22 AM
>>> To: "user@hadoop.apache.org"
>>> Subject: Re: Using YARN with native applications
>>>
>>>   Thanks for the reply, Varun. So if I use the DefaultContainerExecutor
>>> and run a C++ application via a shell script inside a container whose
>>> virtual memory limit is, for example, 2 GB, and that application does a
>>> malloc for 3 GB, YARN will kill the container? I always just thought that
>>> YARN kept its eye on the JVM it spins up for the container (under the
>>> DefaultContainerExecutor).
>>>
>>>  -Kevin
>>>
>>> On Mon, May 25, 2015 at 4:17 AM, Varun Vasudev <vvasudev@hortonworks.com
>>> > wrote:
>>>
>>>>  Hi Kevin,
>>>>
>>>>  By default, the NodeManager monitors physical and virtual memory
>>>> usage of containers. Containers that exceed either limit are killed. Admins
>>>> can disable the checks by setting yarn.nodemanager.pmem-check-enabled
>>>> and/or yarn.nodemanager.vmem-check-enabled to false. The virtual
>>>> memory limit for a container is determined using the config variable
>>>> yarn.nodemanager.vmem-pmem-ratio (default value is 2.1).
>>>>
>>>>  In case of vcores -
>>>>
>>>>    1. If you’re using Cgroups under LinuxContainerExecutor, by
>>>>    default, if there is spare CPU available on the node, your container will
>>>>    be allowed to use it. Admins can restrict containers to use only the CPU
>>>>    allocated to them by setting yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage
>>>>    to true. This setting is only applicable when using Cgroups under
>>>>    LinuxContainerExecutor.
>>>>    2.  If you aren’t using Cgroups under LinuxContainerExecutor, there
>>>>    is no limiting of the amount of the CPU that containers can use.
>>>>
>>>>  -Varun
>>>>
>>>>   From: Kevin
>>>> Reply-To: "user@hadoop.apache.org"
>>>> Date: Friday, May 22, 2015 at 3:30 AM
>>>> To: "user@hadoop.apache.org"
>>>> Subject: Using YARN with native applications
>>>>
>>>>   Hello,
>>>>
>>>>  I have been using the distributed shell application and Oozie to run
>>>> native C++ applications in the cluster. Is YARN able to see the resources
>>>> these native applications use? For example, if I use Oozie's shell action,
>>>> the NodeManager hosts the mapper container and allocates a certain amount
>>>> of memory and vcores (as configured). What happens if my C++ application
>>>> uses more memory or vcores than the NodeManager allocated?
>>>>
>>>>  I was looking in the Hadoop code and I couldn't find my way to an
>>>> answer. It seems, though, that the LinuxContainerExecutor may be the answer to
>>>> my question since it uses cgroups.
>>>>
>>>>  I'm interested to know how YARN reacts to non-Java applications
>>>> running inside of it.
>>>>
>>>>  Thanks,
>>>> Kevin
>>>>
>>>
>>>

Re: Using YARN with native applications

Posted by Varun Vasudev <vv...@hortonworks.com>.
For CPU isolation, you have to use Cgroups with the LinuxContainerExecutor. We don’t enforce CPU limits with the DefaultContainerExecutor.

-Varun

From: Kevin
Reply-To: "user@hadoop.apache.org"
Date: Wednesday, May 27, 2015 at 7:06 PM
To: "user@hadoop.apache.org"
Subject: Re: Using YARN with native applications

Thanks for the tip. In the trunk it looks like the NodeManager's monitor thread doesn't care if the process tree's CPU usage exceeds the container's CPU limit. Is this monitored elsewhere?

I have my eyes on https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java#L476


On Wed, May 27, 2015 at 9:06 AM Varun Vasudev <vv...@hortonworks.com> wrote:
You should also look at ProcfsBasedProcessTree if you want to know how exactly the memory usage is being calculated.

-Varun

From: Kevin
Reply-To: "user@hadoop.apache.org"
Date: Wednesday, May 27, 2015 at 6:22 PM
To: "user@hadoop.apache.org"
Subject: Re: Using YARN with native applications

Varun, thank you for helping me understand this. You pointed out a couple of new things to me. I finally found that monitoring thread in the code (ContainersMonitorImpl.java). I now have a better understanding of how YARN checks a container's resources.

On Wed, May 27, 2015 at 1:23 AM Varun Vasudev <vv...@hortonworks.com> wrote:
YARN should kill the container. I’m not sure what JVM you’re referring to, but the NodeManager writes and then spawns a shell script that will invoke your shell script which in turn (presumably) will invoke your C++ application. A monitoring thread then looks at the memory usage of the process tree and compares it to the limits for the container.
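
As an illustration of the process-tree check described above, here is a minimal Python sketch. It is not the NodeManager's code (the real logic is in ContainersMonitorImpl and ProcfsBasedProcessTree and has more detail); the process table, the PIDs, and the function names are all hypothetical. The point it shows is that memory is summed over every descendant of the container's launch script, so a native binary started from a shell script counts just like a JVM would.

```python
from collections import defaultdict

GB = 1024 ** 3
MB = 1024 ** 2


def tree_memory(processes, root_pid):
    """Sum the memory of root_pid and all of its descendants.

    processes maps pid -> (parent_pid, memory_bytes). It is a hypothetical
    stand-in for the data a real monitor would read from /proc.
    """
    children = defaultdict(list)
    for pid, (ppid, _mem) in processes.items():
        children[ppid].append(pid)

    total = 0
    stack = [root_pid]
    while stack:
        pid = stack.pop()
        total += processes[pid][1]
        stack.extend(children[pid])
    return total


def over_limit(processes, root_pid, limit_bytes):
    """Return True when the whole tree uses more memory than the limit."""
    return tree_memory(processes, root_pid) > limit_bytes


# Hypothetical container: launch script -> user script -> C++ binary.
procs = {
    100: (1, 4 * MB),     # launch script written by the NodeManager
    101: (100, 4 * MB),   # the user's shell script
    102: (101, 3 * GB),   # the native C++ application
    200: (1, 5 * GB),     # unrelated process, not part of the tree
}

print(over_limit(procs, 100, 2 * GB))
```

In this toy table the C++ process alone pushes the tree past a 2 GB limit, while the unrelated process 200 is ignored because it is not a descendant of PID 100.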

-Varun

From: Kevin
Reply-To: "user@hadoop.apache.org"
Date: Tuesday, May 26, 2015 at 7:22 AM
To: "user@hadoop.apache.org"
Subject: Re: Using YARN with native applications

Thanks for the reply, Varun. So if I use the DefaultContainerExecutor and run a C++ application via a shell script inside a container whose virtual memory limit is, for example, 2 GB, and that application does a malloc for 3 GB, YARN will kill the container? I always just thought that YARN kept its eye on the JVM it spins up for the container (under the DefaultContainerExecutor).

-Kevin

On Mon, May 25, 2015 at 4:17 AM, Varun Vasudev <vv...@hortonworks.com> wrote:
Hi Kevin,

By default, the NodeManager monitors physical and virtual memory usage of containers. Containers that exceed either limit are killed. Admins can disable the checks by setting yarn.nodemanager.pmem-check-enabled and/or yarn.nodemanager.vmem-check-enabled to false. The virtual memory limit for a container is determined using the config variable yarn.nodemanager.vmem-pmem-ratio (default value is 2.1).
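
To make the arithmetic concrete, here is a small Python sketch of the two memory checks. It is a simplified model under stated assumptions, not the NodeManager's implementation: the function names and the use of megabytes are illustrative, and the real check has further details that are omitted. The only values taken from the message above are the default ratio of 2.1 and the two on/off switches.

```python
def vmem_limit_mb(pmem_limit_mb, vmem_pmem_ratio=2.1):
    """Virtual memory limit derived from the physical limit and the ratio."""
    return pmem_limit_mb * vmem_pmem_ratio


def should_kill(pmem_used_mb, vmem_used_mb, pmem_limit_mb,
                vmem_pmem_ratio=2.1, pmem_check=True, vmem_check=True):
    """Return True if either enabled check finds the container over its limit.

    pmem_check and vmem_check model yarn.nodemanager.pmem-check-enabled
    and yarn.nodemanager.vmem-check-enabled.
    """
    if pmem_check and pmem_used_mb > pmem_limit_mb:
        return True
    if vmem_check and vmem_used_mb > vmem_limit_mb(pmem_limit_mb, vmem_pmem_ratio):
        return True
    return False


# A container with 1024 MB of physical memory gets roughly 2150 MB of
# virtual memory at the default ratio.
print(round(vmem_limit_mb(1024)))

# 3072 MB of virtual memory exceeds that limit, so this container would be
# killed unless the virtual memory check is disabled.
print(should_kill(500, 3072, 1024))
print(should_kill(500, 3072, 1024, vmem_check=False))
```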

In case of vcores -

  1. If you’re using Cgroups under LinuxContainerExecutor, by default, if there is spare CPU available on the node, your container will be allowed to use it. Admins can restrict containers to use only the CPU allocated to them by setting yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage to true. This setting is only applicable when using Cgroups under LinuxContainerExecutor.
  2. If you aren’t using Cgroups under LinuxContainerExecutor, there is no limit on the amount of CPU that containers can use.
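
The two cases above can be summarized in a toy Python model. This is only a sketch of the behavior as described in this message, assuming an otherwise idle node; the function name and parameters are hypothetical, and it does not model how cgroups divides CPU between containers when the node is busy.

```python
def max_cpu_fraction(container_vcores, node_vcores, cgroups_enabled, strict):
    """Largest fraction of the node's CPU a container can use on an idle node.

    cgroups_enabled: Cgroups under LinuxContainerExecutor are in use.
    strict: models ...cgroups.strict-resource-usage being set to true.
    """
    if cgroups_enabled and strict:
        # Hard cap: the container is held to its own allocation.
        return min(1.0, container_vcores / node_vcores)
    # Either no CPU limiting at all, or spare CPU may be borrowed.
    return 1.0


# A 2-vcore container on an 8-vcore node.
print(max_cpu_fraction(2, 8, cgroups_enabled=True, strict=True))
print(max_cpu_fraction(2, 8, cgroups_enabled=True, strict=False))
print(max_cpu_fraction(2, 8, cgroups_enabled=False, strict=False))
```

Only the first call is capped (at a quarter of the node); in the other two configurations the container can use whatever CPU is free.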

-Varun

From: Kevin
Reply-To: "user@hadoop.apache.org"
Date: Friday, May 22, 2015 at 3:30 AM
To: "user@hadoop.apache.org"
Subject: Using YARN with native applications

Hello,

I have been using the distributed shell application and Oozie to run native C++ applications in the cluster. Is YARN able to see the resources these native applications use? For example, if I use Oozie's shell action, the NodeManager hosts the mapper container and allocates a certain amount of memory and vcores (as configured). What happens if my C++ application uses more memory or vcores than the NodeManager allocated?

I was looking in the Hadoop code and I couldn't find my way to an answer. It seems, though, that the LinuxContainerExecutor may be the answer to my question since it uses cgroups.

I'm interested to know how YARN reacts to non-Java applications running inside of it.

Thanks,
Kevin


RE: Using YARN with native applications

Posted by "Naganarasimha G R (Naga)" <ga...@huawei.com>.
Hi Kevin,
CPU monitoring is done by cgroups. Basically, for CPU, cgroups does not allow the processes to take more CPU cycles than are configured overall for the node.
Also, as Varun mentioned:

  1. If you’re using Cgroups under LinuxContainerExecutor, by default, if there is spare CPU available on the node, your container will be allowed to use it. Admins can restrict containers to use only the CPU allocated to them by setting yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage to true. This setting is only applicable when using Cgroups under LinuxContainerExecutor.
  2. If you aren’t using Cgroups under LinuxContainerExecutor, there is no limit on the amount of CPU that containers can use.

You can also refer to the documentation at:
http://hadoop.apache.org/docs/r2.7.0/hadoop-yarn/hadoop-yarn-site/NodeManagerCgroups.html
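
For readers who want to check which of these modes a NodeManager is configured for, here is a hedged Python sketch that reads yarn-site.xml style properties. Only the strict-resource-usage property appears in this thread; the executor and resources-handler property names and class names are assumptions based on the documentation page linked above and should be verified against your Hadoop version. The sample XML and the mode labels are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed property names; only STRICT_KEY is quoted in this thread.
EXECUTOR_KEY = "yarn.nodemanager.container-executor.class"
HANDLER_KEY = "yarn.nodemanager.linux-container-executor.resources-handler.class"
STRICT_KEY = "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage"


def read_properties(xml_text):
    """Parse Hadoop-style <configuration><property> XML into a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def cpu_mode(props):
    """Classify CPU handling as 'unlimited', 'soft share' or 'hard cap'."""
    uses_lce = props.get(EXECUTOR_KEY, "").endswith("LinuxContainerExecutor")
    uses_cgroups = props.get(HANDLER_KEY, "").endswith("CgroupsLCEResourcesHandler")
    if not (uses_lce and uses_cgroups):
        return "unlimited"
    if props.get(STRICT_KEY, "false").lower() == "true":
        return "hard cap"
    return "soft share"


# Hypothetical configuration fragment.
SAMPLE = """
<configuration>
  <property>
    <name>yarn.nodemanager.container-executor.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
  </property>
  <property>
    <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
  </property>
  <property>
    <name>yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage</name>
    <value>true</value>
  </property>
</configuration>
""".strip()

print(cpu_mode(read_properties(SAMPLE)))
```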

+ Naga
________________________________
From: Kevin [kevin.macksamie@gmail.com]
Sent: Wednesday, May 27, 2015 19:06
To: user@hadoop.apache.org
Subject: Re: Using YARN with native applications

Thanks for the tip. In the trunk it looks like the NodeManager's monitor thread doesn't care if the process tree's CPU usage exceeds the container's CPU limit. Is this monitored elsewhere?

I have my eyes on https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java#L476


On Wed, May 27, 2015 at 9:06 AM Varun Vasudev <vv...@hortonworks.com> wrote:
You should also look at ProcfsBasedProcessTree if you want to know how exactly the memory usage is being calculated.

-Varun

From: Kevin
Reply-To: "user@hadoop.apache.org"
Date: Wednesday, May 27, 2015 at 6:22 PM

To: "user@hadoop.apache.org"
Subject: Re: Using YARN with native applications

Varun, thank you for helping me understand this. You pointed out a couple of new things to me. I finally found that monitoring thread in the code (ContainersMonitorImpl.java). I now have a better understanding of how YARN checks a container's resources.

On Wed, May 27, 2015 at 1:23 AM Varun Vasudev <vv...@hortonworks.com> wrote:
YARN should kill the container. I’m not sure what JVM you’re referring to, but the NodeManager writes and then spawns a shell script that will invoke your shell script, which in turn (presumably) will invoke your C++ application. A monitoring thread then looks at the memory usage of the process tree and compares it to the limits for the container.
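
To make the "process tree" idea concrete, here is a minimal sketch, assuming a simplified in-memory process table instead of the real /proc data. The function name and the table layout are hypothetical and this is not the NodeManager's code. It only shows why a native child launched from a shell script still counts against the container: usage is summed over the launch script and every descendant.

```python
from collections import defaultdict


def process_tree_memory(processes, root_pid):
    """Sum (vmem, rss) in bytes over root_pid and all of its descendants.

    processes is a hypothetical table mapping
    pid -> (parent_pid, vmem_bytes, rss_bytes).
    """
    # Index children by parent so the tree can be walked from the root.
    children = defaultdict(list)
    for pid, (ppid, _vmem, _rss) in processes.items():
        children[ppid].append(pid)

    total_vmem = total_rss = 0
    stack = [root_pid]
    seen = set()
    while stack:
        pid = stack.pop()
        if pid in seen or pid not in processes:
            continue
        seen.add(pid)
        _ppid, vmem, rss = processes[pid]
        total_vmem += vmem
        total_rss += rss
        stack.extend(children[pid])
    return total_vmem, total_rss
```

For example, if pid 100 is the launch script, 101 is the user's shell script and 102 is the C++ binary, calling process_tree_memory(table, 100) includes the binary's memory even though YARN never started it directly.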

-Varun

From: Kevin
Reply-To: "user@hadoop.apache.org"
Date: Tuesday, May 26, 2015 at 7:22 AM
To: "user@hadoop.apache.org"
Subject: Re: Using YARN with native applications

Thanks for the reply, Varun. So if I use the DefaultContainerExecutor and run a C++ application via a shell script inside a container whose virtual memory limit is, for example, 2 GB, and that application does a malloc for 3 GB, YARN will kill the container? I always just thought that YARN kept its eye on the JVM it spins up for the container (under the DefaultContainerExecutor).

-Kevin

On Mon, May 25, 2015 at 4:17 AM, Varun Vasudev <vv...@hortonworks.com> wrote:
Hi Kevin,

By default, the NodeManager monitors physical and virtual memory usage of containers. Containers that exceed either limit are killed. Admins can disable the checks by setting yarn.nodemanager.pmem-check-enabled and/or yarn.nodemanager.vmem-check-enabled to false. The virtual memory limit for a container is determined using the config variable yarn.nodemanager.vmem-pmem-ratio (default value is 2.1).
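
As a worked example of the arithmetic, here is a rough sketch assuming the virtual limit is simply the physical limit multiplied by the ratio. The function names are hypothetical; only the 2.1 default and the two check switches come from the description above.

```python
def vmem_limit_mb(pmem_limit_mb, vmem_pmem_ratio=2.1):
    """Virtual memory limit derived from the physical limit and the ratio."""
    return pmem_limit_mb * vmem_pmem_ratio


def exceeded_limits(pmem_used_mb, vmem_used_mb, pmem_limit_mb,
                    vmem_pmem_ratio=2.1, pmem_check=True, vmem_check=True):
    """Return which limits are exceeded; a non-empty list means 'kill'."""
    reasons = []
    # pmem_check / vmem_check model the two *-check-enabled settings.
    if pmem_check and pmem_used_mb > pmem_limit_mb:
        reasons.append("physical")
    if vmem_check and vmem_used_mb > vmem_limit_mb(pmem_limit_mb, vmem_pmem_ratio):
        reasons.append("virtual")
    return reasons
```

Under these assumptions a container with a 1024 MB physical limit gets a virtual limit of about 2150 MB, so a process tree that maps 3072 MB of address space would trip the virtual check even if it has touched very little of it.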

In the case of vcores:

  1.  If you’re using Cgroups under LinuxContainerExecutor, by default, if there is spare CPU available on the node, your container will be allowed to use it. Admins can restrict containers to use only the CPU allocated to them by setting yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage to true. This setting is only applicable when using Cgroups under LinuxContainerExecutor.
  2.  If you aren’t using Cgroups under LinuxContainerExecutor, there is no limit on the amount of CPU that containers can use.

-Varun

From: Kevin
Reply-To: "user@hadoop.apache.org"
Date: Friday, May 22, 2015 at 3:30 AM
To: "user@hadoop.apache.org"
Subject: Using YARN with native applications

Hello,

I have been using the distributed shell application and Oozie to run native C++ applications in the cluster. Is YARN able to see the resources these native applications use? For example, if I use Oozie's shell action, the NodeManager hosts the mapper container and allocates a certain amount of memory and vcores (as configured). What happens if my C++ application uses more memory or vcores than the NodeManager allocated?

I was looking through the Hadoop code and couldn't find my way to an answer, although it seems the LinuxContainerExecutor may be the answer to my question since it uses cgroups.

I'm interested to know how YARN reacts to non-Java applications running inside of it.

Thanks,
Kevin


Re: Using YARN with native applications

Posted by Varun Vasudev <vv...@hortonworks.com>.
For CPU isolation, you have to use Cgroups with the LinuxContainerExecutor. We don’t enforce cpu limits with the DefaultContainerExecutor.
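
For anyone who wants to try the strict setting, here is a small sketch that renders the one property named in this thread in the usual Hadoop configuration XML layout. The helper name is hypothetical, and this only covers that single property: switching the NodeManager to the LinuxContainerExecutor with Cgroups needs further settings that are described in the NodeManagerCgroups documentation and are not shown here.

```python
import xml.etree.ElementTree as ET


def render_yarn_site(properties):
    """Render a dict of property name -> value as Hadoop-style config XML."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# The only property taken from this thread; everything else is left out.
STRICT_CPU = {
    "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage": "true",
}

if __name__ == "__main__":
    print(render_yarn_site(STRICT_CPU))
```

The printed fragment is meant to be merged by hand into an existing yarn-site.xml, not to replace it.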

-Varun

From: Kevin
Reply-To: "user@hadoop.apache.org"
Date: Wednesday, May 27, 2015 at 7:06 PM
To: "user@hadoop.apache.org"
Subject: Re: Using YARN with native applications

Thanks for the tip. In trunk, it looks like the NodeManager's monitor thread doesn't care if the process tree's CPU usage exceeds the container's CPU limit. Is this monitored elsewhere?

I have my eyes on https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java#L476


On Wed, May 27, 2015 at 9:06 AM Varun Vasudev <vv...@hortonworks.com> wrote:
You should also look at ProcfsBasedProcessTree if you want to know how exactly the memory usage is being calculated.

-Varun

From: Kevin
Reply-To: "user@hadoop.apache.org"
Date: Wednesday, May 27, 2015 at 6:22 PM

To: "user@hadoop.apache.org"
Subject: Re: Using YARN with native applications

Varun, thank you for helping me understand this. You pointed out a couple of new things to me. I finally found that monitoring thread in the code (ContainersMonitorImpl.java). I now have a better understanding of how YARN checks a container's resources.

On Wed, May 27, 2015 at 1:23 AM Varun Vasudev <vv...@hortonworks.com> wrote:
YARN should kill the container. I’m not sure what JVM you’re referring to, but the NodeManager writes and then spawns a shell script that will invoke your shell script, which in turn (presumably) will invoke your C++ application. A monitoring thread then looks at the memory usage of the process tree and compares it to the limits for the container.

-Varun

From: Kevin
Reply-To: "user@hadoop.apache.org"
Date: Tuesday, May 26, 2015 at 7:22 AM
To: "user@hadoop.apache.org"
Subject: Re: Using YARN with native applications

Thanks for the reply, Varun. So if I use the DefaultContainerExecutor and run a C++ application via a shell script inside a container whose virtual memory limit is, for example, 2 GB, and that application does a malloc for 3 GB, YARN will kill the container? I always just thought that YARN kept its eye on the JVM it spins up for the container (under the DefaultContainerExecutor).

-Kevin

On Mon, May 25, 2015 at 4:17 AM, Varun Vasudev <vv...@hortonworks.com> wrote:
Hi Kevin,

By default, the NodeManager monitors physical and virtual memory usage of containers. Containers that exceed either limit are killed. Admins can disable the checks by setting yarn.nodemanager.pmem-check-enabled and/or yarn.nodemanager.vmem-check-enabled to false. The virtual memory limit for a container is determined using the config variable yarn.nodemanager.vmem-pmem-ratio (default value is 2.1).

In the case of vcores:

  1.  If you’re using Cgroups under LinuxContainerExecutor, by default, if there is spare CPU available on the node, your container will be allowed to use it. Admins can restrict containers to use only the CPU allocated to them by setting yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage to true. This setting is only applicable when using Cgroups under LinuxContainerExecutor.
  2.  If you aren’t using Cgroups under LinuxContainerExecutor, there is no limit on the amount of CPU that containers can use.

-Varun

From: Kevin
Reply-To: "user@hadoop.apache.org"
Date: Friday, May 22, 2015 at 3:30 AM
To: "user@hadoop.apache.org"
Subject: Using YARN with native applications

Hello,

I have been using the distributed shell application and Oozie to run native C++ applications in the cluster. Is YARN able to see the resources these native applications use? For example, if I use Oozie's shell action, the NodeManager hosts the mapper container and allocates a certain amount of memory and vcores (as configured). What happens if my C++ application uses more memory or vcores than the NodeManager allocated?

I was looking through the Hadoop code and couldn't find my way to an answer, although it seems the LinuxContainerExecutor may be the answer to my question since it uses cgroups.

I'm interested to know how YARN reacts to non-Java applications running inside of it.

Thanks,
Kevin


Re: Using YARN with native applications

Posted by Kevin <ke...@gmail.com>.
Thanks for the tip. In trunk, it looks like the NodeManager's monitor
thread doesn't care if the process tree's CPU usage exceeds the container's
CPU limit. Is this monitored elsewhere?

I have my eyes on
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java#L476


On Wed, May 27, 2015 at 9:06 AM Varun Vasudev <vv...@hortonworks.com>
wrote:

>   You should also look at ProcfsBasedProcessTree if you want to know how
> exactly the memory usage is being calculated.
>
>  -Varun
>
>   From: Kevin
> Reply-To: "user@hadoop.apache.org"
> Date: Wednesday, May 27, 2015 at 6:22 PM
>
> To: "user@hadoop.apache.org"
> Subject: Re: Using YARN with native applications
>
>   Varun, thank you for helping me understand this. You pointed out a
> couple of new things to me. I finally found that monitoring thread in the
> code (ContainersMonitorImpl.java). I can now see and gain a better
> understanding of YARN checks on a container's resources.
>
>  On Wed, May 27, 2015 at 1:23 AM Varun Vasudev <vv...@hortonworks.com>
> wrote:
>
>>  YARN should kill the container. I’m not sure what JVM you’re referring
>> to, but the NodeManager writes and then spawns a shell script that will
>> invoke your shell script, which in turn (presumably) will invoke your C++
>> application. A monitoring thread then looks at the memory usage of the
>> process tree and compares it to the limits for the container.
>>
>>  -Varun
>>
>>   From: Kevin
>> Reply-To: "user@hadoop.apache.org"
>> Date: Tuesday, May 26, 2015 at 7:22 AM
>> To: "user@hadoop.apache.org"
>> Subject: Re: Using YARN with native applications
>>
>>   Thanks for the reply, Varun. So if I use the DefaultContainerExecutor
>> and run a C++ application via a shell script inside a container whose
>> virtual memory limit is, for example, 2 GB, and that application does a
>> malloc for 3 GB, YARN will kill the container? I always just thought that
>> YARN kept its eye on the JVM it spins up for the container (under the
>> DefaultContainerExecutor).
>>
>>  -Kevin
>>
>> On Mon, May 25, 2015 at 4:17 AM, Varun Vasudev <vv...@hortonworks.com>
>> wrote:
>>
>>>  Hi Kevin,
>>>
>>>  By default, the NodeManager monitors physical and virtual memory usage
>>> of containers. Containers that exceed either limit are killed. Admins can
>>> disable the checks by setting yarn.nodemanager.pmem-check-enabled
>>> and/or yarn.nodemanager.vmem-check-enabled to false. The virtual memory
>>> limit for a container is determined using the config variable yarn.nodemanager.vmem-pmem-ratio (default
>>> value is 2.1).
>>>
>>>  In case of vcores -
>>>
>>>    1. If you’re using Cgroups under LinuxContainerExecutor, by default,
>>>    if there is spare CPU available on the node, your container will be allowed
>>>    to use it. Admins can restrict containers to use only the CPU allocated to
>>>    them by setting yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage
>>>    to true. This setting is only applicable when using Cgroups under
>>>    LinuxContainerExecutor.
>>>    2.  If you aren’t using Cgroups under LinuxContainerExecutor, there
>>>    is no limiting of the amount of the CPU that containers can use.
>>>
>>>  -Varun
>>>
>>>   From: Kevin
>>> Reply-To: "user@hadoop.apache.org"
>>> Date: Friday, May 22, 2015 at 3:30 AM
>>> To: "user@hadoop.apache.org"
>>> Subject: Using YARN with native applications
>>>
>>>   Hello,
>>>
>>>  I have been using the distributed shell application and Oozie to run
>>> native C++ applications in the cluster. Is YARN able to see the resources
>>> these native applications use? For example, if I use Oozie's shell action,
>>> the NodeManager hosts the mapper container and allocates a certain amount
>>> of memory and vcores (as configured). What happens if my C++ application
>>> uses more memory or vcores than the NodeManager allocated?
>>>
>>>  I was looking in the Hadoop code and I couldn't find my way to answer.
>>> Although, it seems the LinuxContainerExecutor may be the answer to my
>>> question since it uses cgroups.
>>>
>>>  I'm interested to know how YARN reacts to non-Java applications
>>> running inside of it.
>>>
>>>  Thanks,
>>> Kevin
>>>
>>
>>

Re: Using YARN with native applications

Posted by Kevin <ke...@gmail.com>.
Thanks for the tip. In trunk, it looks like the NodeManager's monitor
thread doesn't check whether the process tree's CPU usage exceeds the
container's CPU limit. Is this monitored elsewhere?

I have my eyes on
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java#L476


On Wed, May 27, 2015 at 9:06 AM Varun Vasudev <vv...@hortonworks.com>
wrote:

>   You should also look at ProcfsBasedProcessTree if you want to know how
> exactly the memory usage is being calculated.
>
>  -Varun
>
>   From: Kevin
> Reply-To: "user@hadoop.apache.org"
> Date: Wednesday, May 27, 2015 at 6:22 PM
>
> To: "user@hadoop.apache.org"
> Subject: Re: Using YARN with native applications
>
>   Varun, thank you for helping me understand this. You pointed out a
> couple of new things to me. I finally found that monitoring thread in the
> code (ContainersMonitorImpl.java). I can now see and gain a better
> understanding of YARN checks on a container's resources.
>
>  On Wed, May 27, 2015 at 1:23 AM Varun Vasudev <vv...@hortonworks.com>
> wrote:
>
>>  YARN should kill the container. I’m not sure what JVM you’re referring
>> to, but the NodeManager writes and then spawns a shell script that will
>> invoke your shell script, which in turn (presumably) will invoke your C++
>> application. A monitoring thread then looks at the memory usage of the
>> process tree and compares it to the limits for the container.
>>
>>  -Varun
>>
>>   From: Kevin
>> Reply-To: "user@hadoop.apache.org"
>> Date: Tuesday, May 26, 2015 at 7:22 AM
>> To: "user@hadoop.apache.org"
>> Subject: Re: Using YARN with native applications
>>
>>   Thanks for the reply, Varun. So if I use the DefaultContainerExecutor
>> and run a C++ application via a shell script inside a container whose
>> virtual memory limit is, for example, 2 GB, and that application does a
>> malloc for 3 GB, YARN will kill the container? I always just thought that
>> YARN kept its eye on the JVM it spins up for the container (under the
>> DefaultContainerExecutor).
>>
>>  -Kevin
>>
>> On Mon, May 25, 2015 at 4:17 AM, Varun Vasudev <vv...@hortonworks.com>
>> wrote:
>>
>>>  Hi Kevin,
>>>
>>>  By default, the NodeManager monitors physical and virtual memory usage
>>> of containers. Containers that exceed either limit are killed. Admins can
>>> disable the checks by setting yarn.nodemanager.pmem-check-enabled
>>> and/or yarn.nodemanager.vmem-check-enabled to false. The virtual memory
>>> limit for a container is determined by the config variable
>>> yarn.nodemanager.vmem-pmem-ratio (default value is 2.1).
>>>
>>>  In case of vcores -
>>>
>>>    1. If you’re using Cgroups under LinuxContainerExecutor, by default,
>>>    if there is spare CPU available on the node, your container will be allowed
>>>    to use it. Admins can restrict containers to use only the CPU allocated to
>>>    them by setting yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage
>>>    to true. This setting is only applicable when using Cgroups under
>>>    LinuxContainerExecutor.
>>>    2.  If you aren’t using Cgroups under LinuxContainerExecutor, there
>>>    is no limiting of the amount of the CPU that containers can use.
>>>
>>>  -Varun
>>>
>>>   From: Kevin
>>> Reply-To: "user@hadoop.apache.org"
>>> Date: Friday, May 22, 2015 at 3:30 AM
>>> To: "user@hadoop.apache.org"
>>> Subject: Using YARN with native applications
>>>
>>>   Hello,
>>>
>>>  I have been using the distributed shell application and Oozie to run
>>> native C++ applications in the cluster. Is YARN able to see the resources
>>> these native applications use? For example, if I use Oozie's shell action,
>>> the NodeManager hosts the mapper container and allocates a certain amount
>>> of memory and vcores (as configured). What happens if my C++ application
>>> uses more memory or vcores than the NodeManager allocated?
>>>
>>>  I was looking in the Hadoop code and I couldn't find my way to answer.
>>> Although, it seems the LinuxContainerExecutor may be the answer to my
>>> question since it uses cgroups.
>>>
>>>  I'm interested to know how YARN reacts to non-Java applications
>>> running inside of it.
>>>
>>>  Thanks,
>>> Kevin
>>>
>>
>>

Re: Using YARN with native applications

Posted by Varun Vasudev <vv...@hortonworks.com>.
You should also look at ProcfsBasedProcessTree if you want to know how exactly the memory usage is being calculated.

-Varun

From: Kevin
Reply-To: "user@hadoop.apache.org"
Date: Wednesday, May 27, 2015 at 6:22 PM
To: "user@hadoop.apache.org"
Subject: Re: Using YARN with native applications

Varun, thank you for helping me understand this. You pointed out a couple of new things to me. I finally found that monitoring thread in the code (ContainersMonitorImpl.java). I can now see and gain a better understanding of YARN checks on a container's resources.

On Wed, May 27, 2015 at 1:23 AM Varun Vasudev <vv...@hortonworks.com> wrote:
YARN should kill the container. I’m not sure what JVM you’re referring to, but the NodeManager writes and then spawns a shell script that will invoke your shell script, which in turn (presumably) will invoke your C++ application. A monitoring thread then looks at the memory usage of the process tree and compares it to the limits for the container.
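
As a rough illustration of that process-tree check, the sketch below sums virtual and resident memory over a root process and all of its descendants, which is why a native binary launched from a shell script still counts against the container. This is not Hadoop's code: the function name, the PIDs, and the sizes are all made up for the example, and a real implementation would read these values from /proc rather than from a dictionary.

```python
from collections import defaultdict

MB = 1024 * 1024


def tree_usage(procs, root_pid):
    """Return (vmem_bytes, rss_bytes) summed over root_pid and its descendants.

    procs maps pid -> (ppid, vmem_bytes, rss_bytes). All data is hypothetical.
    """
    # Build a parent -> children index so the tree can be walked top-down.
    children = defaultdict(list)
    for pid, (ppid, _vmem, _rss) in procs.items():
        children[ppid].append(pid)

    total_vmem = 0
    total_rss = 0
    seen = set()
    stack = [root_pid]
    while stack:
        pid = stack.pop()
        if pid in seen or pid not in procs:
            continue
        seen.add(pid)
        _ppid, vmem, rss = procs[pid]
        total_vmem += vmem
        total_rss += rss
        stack.extend(children[pid])
    return total_vmem, total_rss


# Hypothetical snapshot of one container's processes plus an unrelated one.
sample = {
    100: (1, 10 * MB, 2 * MB),          # launch script spawned for the container
    101: (100, 12 * MB, 3 * MB),        # the user's shell script
    102: (101, 3072 * MB, 900 * MB),    # the native C++ application
    200: (1, 500 * MB, 100 * MB),       # unrelated process, outside the tree
}

vmem, rss = tree_usage(sample, 100)
print(vmem // MB, rss // MB)
```

The unrelated process (PID 200) is excluded because it is not a descendant of the container's root process.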

-Varun

From: Kevin
Reply-To: "user@hadoop.apache.org"
Date: Tuesday, May 26, 2015 at 7:22 AM
To: "user@hadoop.apache.org"
Subject: Re: Using YARN with native applications

Thanks for the reply, Varun. So if I use the DefaultContainerExecutor and run a C++ application via a shell script inside a container whose virtual memory limit is, for example, 2 GB, and that application does a malloc for 3 GB, YARN will kill the container? I always just thought that YARN kept its eye on the JVM it spins up for the container (under the DefaultContainerExecutor).

-Kevin

On Mon, May 25, 2015 at 4:17 AM, Varun Vasudev <vv...@hortonworks.com> wrote:
Hi Kevin,

By default, the NodeManager monitors physical and virtual memory usage of containers. Containers that exceed either limit are killed. Admins can disable the checks by setting yarn.nodemanager.pmem-check-enabled and/or yarn.nodemanager.vmem-check-enabled to false. The virtual memory limit for a container is determined by the config variable yarn.nodemanager.vmem-pmem-ratio (default value is 2.1).
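
To make the arithmetic concrete, here is a minimal sketch of how the virtual memory limit follows from the physical limit and the ratio. Only the 2.1 default and the two check switches come from the message above; the function names and the 1024 MB container size are assumptions for illustration, and the NodeManager's real check is likely more involved than a single comparison.

```python
def vmem_limit_mb(pmem_limit_mb, vmem_pmem_ratio=2.1):
    """Virtual memory limit derived from the physical limit and the ratio."""
    return pmem_limit_mb * vmem_pmem_ratio


def over_limit(usage_mb, limit_mb, check_enabled=True):
    """True if the check is enabled and usage exceeds the limit.

    check_enabled stands in for yarn.nodemanager.pmem-check-enabled or
    yarn.nodemanager.vmem-check-enabled, depending on which limit is tested.
    """
    return check_enabled and usage_mb > limit_mb


# Hypothetical container that was granted 1024 MB of physical memory.
limit = vmem_limit_mb(1024)           # about 2150.4 MB of virtual memory
print(over_limit(3072, limit))        # a 3 GB allocation exceeds the limit
print(over_limit(3072, limit, check_enabled=False))
```

With the check disabled, the same usage is no longer reported as a violation, which matches the behaviour described for the two `*-check-enabled` settings.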

In case of vcores -

  1.  If you’re using Cgroups under LinuxContainerExecutor, by default, if there is spare CPU available on the node, your container will be allowed to use it. Admins can restrict containers to use only the CPU allocated to them by setting yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage to true. This setting is only applicable when using Cgroups under LinuxContainerExecutor.
  2.  If you aren’t using Cgroups under LinuxContainerExecutor, there is no limit on the amount of CPU that containers can use.
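
The difference between the two CPU modes can be sketched in terms of the Linux CFS bandwidth settings that cgroups expose (cpu.cfs_period_us and cpu.cfs_quota_us, where a quota of -1 means no hard cap). This is only an illustration of the mechanism: the function name is hypothetical, and how YARN actually translates vcores into a core count or quota is an assumption that is not shown here and may differ between versions.

```python
def cfs_quota_us(cpu_cores, period_us=100_000, strict=True):
    """Return a CFS quota in microseconds per period, or -1 for no hard cap.

    cpu_cores is the number of physical cores' worth of CPU time the
    container may consume. strict mirrors the idea behind
    yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage.
    """
    if not strict:
        # Non-strict: the container may use spare CPU on the node.
        return -1
    # Strict: cap CPU time at cpu_cores * period in every scheduling period.
    return int(cpu_cores * period_us)


print(cfs_quota_us(1.5))                 # hard cap at 1.5 cores
print(cfs_quota_us(1.5, strict=False))   # no hard cap
```

Without cgroups, there is no quota at all, so neither branch applies and the container's CPU use is unconstrained.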

-Varun

From: Kevin
Reply-To: "user@hadoop.apache.org"
Date: Friday, May 22, 2015 at 3:30 AM
To: "user@hadoop.apache.org"
Subject: Using YARN with native applications

Hello,

I have been using the distributed shell application and Oozie to run native C++ applications in the cluster. Is YARN able to see the resources these native applications use? For example, if I use Oozie's shell action, the NodeManager hosts the mapper container and allocates a certain amount of memory and vcores (as configured). What happens if my C++ application uses more memory or vcores than the NodeManager allocated?

I was looking in the Hadoop code and I couldn't find my way to answer. Although, it seems the LinuxContainerExecutor may be the answer to my question since it uses cgroups.

I'm interested to know how YARN reacts to non-Java applications running inside of it.

Thanks,
Kevin


Re: Using YARN with native applications

Posted by Kevin <ke...@gmail.com>.
Varun, thank you for helping me understand this. You pointed out a couple
of new things to me. I finally found that monitoring thread in the code
(ContainersMonitorImpl.java). I can now see and gain a better understanding
of YARN checks on a container's resources.

On Wed, May 27, 2015 at 1:23 AM Varun Vasudev <vv...@hortonworks.com>
wrote:

>  YARN should kill the container. I’m not sure what JVM you’re referring
> to, but the NodeManager writes and then spawns a shell script that will
> invoke your shell script, which in turn (presumably) will invoke your C++
> application. A monitoring thread then looks at the memory usage of the
> process tree and compares it to the limits for the container.
>
>  -Varun
>
>   From: Kevin
> Reply-To: "user@hadoop.apache.org"
> Date: Tuesday, May 26, 2015 at 7:22 AM
> To: "user@hadoop.apache.org"
> Subject: Re: Using YARN with native applications
>
>   Thanks for the reply, Varun. So if I use the DefaultContainerExecutor
> and run a C++ application via a shell script inside a container whose
> virtual memory limit is, for example, 2 GB, and that application does a
> malloc for 3 GB, YARN will kill the container? I always just thought that
> YARN kept its eye on the JVM it spins up for the container (under the
> DefaultContainerExecutor).
>
>  -Kevin
>
> On Mon, May 25, 2015 at 4:17 AM, Varun Vasudev <vv...@hortonworks.com>
> wrote:
>
>>  Hi Kevin,
>>
>>  By default, the NodeManager monitors physical and virtual memory usage
>> of containers. Containers that exceed either limit are killed. Admins can
>> disable the checks by setting yarn.nodemanager.pmem-check-enabled and/or yarn.nodemanager.vmem-check-enabled
>> to false. The virtual memory limit for a container is determined using the
>> config variable yarn.nodemanager.vmem-pmem-ratio (default value is 2.1).
>>
>>  In case of vcores -
>>
>>    1. If you’re using Cgroups under LinuxContainerExecutor, by default,
>>    if there is spare CPU available on the node, your container will be allowed
>>    to use it. Admins can restrict containers to use only the CPU allocated to
>>    them by setting yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage
>>    to true. This setting is only applicable when using Cgroups under
>>    LinuxContainerExecutor.
>>    2.  If you aren’t using Cgroups under LinuxContainerExecutor, there
>>    is no limiting of the amount of the CPU that containers can use.
>>
>>  -Varun
>>
>>   From: Kevin
>> Reply-To: "user@hadoop.apache.org"
>> Date: Friday, May 22, 2015 at 3:30 AM
>> To: "user@hadoop.apache.org"
>> Subject: Using YARN with native applications
>>
>>   Hello,
>>
>>  I have been using the distributed shell application and Oozie to run
>> native C++ applications in the cluster. Is YARN able to see the resources
>> these native applications use? For example, if I use Oozie's shell action,
>> the NodeManager hosts the mapper container and allocates a certain amount
>> of memory and vcores (as configured). What happens if my C++ application
>> uses more memory or vcores than the NodeManager allocated?
>>
>>  I was looking in the Hadoop code and I couldn't find my way to answer.
>> Although, it seems the LinuxContainerExecutor may be the answer to my
>> question since it uses cgroups.
>>
>>  I'm interested to know how YARN reacts to non-Java applications running
>> inside of it.
>>
>>  Thanks,
>> Kevin
>>
>
>

Re: Using YARN with native applications

Posted by Kevin <ke...@gmail.com>.
Varun, thank you for helping me understand this. You pointed out a couple
of new things to me. I finally found that monitoring thread in the code
(ContainersMonitorImpl.java). I can now see, and better understand, how YARN
checks a container's resources.

On Wed, May 27, 2015 at 1:23 AM Varun Vasudev <vv...@hortonworks.com>
wrote:

>  YARN should kill the container. I’m not sure what JVM you’re referring
> to, but the NodeManager writes and then spawns a shell script that will
> invoke your shell script, which in turn (presumably) will invoke your C++
> application. A monitoring thread then looks at the memory usage of the
> process tree and compares it to the limits for the container.
>
>  -Varun
>
>   From: Kevin
> Reply-To: "user@hadoop.apache.org"
> Date: Tuesday, May 26, 2015 at 7:22 AM
> To: "user@hadoop.apache.org"
> Subject: Re: Using YARN with native applications
>
>   Thanks for the reply, Varun. So if I use the DefaultContainerExecutor
> and run a C++ application via a shell script inside a container whose
> virtual memory limit is, for example, 2 GB, and that application does a
> malloc for 3 GB, YARN will kill the container? I always just thought that
> YARN kept its eye on the JVM it spins up for the container (under the
> DefaultContainerExecutor).
>
>  -Kevin
>
> On Mon, May 25, 2015 at 4:17 AM, Varun Vasudev <vv...@hortonworks.com>
> wrote:
>
>>  Hi Kevin,
>>
>>  By default, the NodeManager monitors physical and virtual memory usage
>> of containers. Containers that exceed either limit are killed. Admins can
>> disable the checks by setting yarn.nodemanager.pmem-check-enabled and/or yarn.nodemanager.vmem-check-enabled
>> to false. The virtual memory limit for a container is determined using the
>> config variable yarn.nodemanager.vmem-pmem-ratio (default value is 2.1).
>>
>>  In case of vcores -
>>
>>    1. If you’re using Cgroups under LinuxContainerExecutor, by default,
>>    if there is spare CPU available on the node, your container will be allowed
>>    to use it. Admins can restrict containers to use only the CPU allocated to
>>    them by setting yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage
>>    to true. This setting is only applicable when using Cgroups under
>>    LinuxContainerExecutor.
>>    2.  If you aren’t using Cgroups under LinuxContainerExecutor, there
>>    is no limiting of the amount of the CPU that containers can use.
>>
>>  -Varun
>>
>>   From: Kevin
>> Reply-To: "user@hadoop.apache.org"
>> Date: Friday, May 22, 2015 at 3:30 AM
>> To: "user@hadoop.apache.org"
>> Subject: Using YARN with native applications
>>
>>   Hello,
>>
>>  I have been using the distributed shell application and Oozie to run
>> native C++ applications in the cluster. Is YARN able to see the resources
>> these native applications use? For example, if I use Oozie's shell action,
>> the NodeManager hosts the mapper container and allocates a certain amount
>> of memory and vcores (as configured). What happens if my C++ application
>> uses more memory or vcores than the NodeManager allocated?
>>
>>  I was looking in the Hadoop code and I couldn't find my way to an answer.
>> However, it seems the LinuxContainerExecutor may be the answer to my
>> question, since it uses cgroups.
>>
>>  I'm interested to know how YARN reacts to non-Java applications running
>> inside of it.
>>
>>  Thanks,
>> Kevin
>>
>
>

Re: Using YARN with native applications

Posted by Varun Vasudev <vv...@hortonworks.com>.
YARN should kill the container. I’m not sure what JVM you’re referring to, but the NodeManager writes and then spawns a shell script that invokes your shell script, which in turn (presumably) invokes your C++ application. A monitoring thread then looks at the memory usage of the process tree and compares it to the limits for the container.
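
As an illustration only, here is a minimal Python sketch of that process-tree check. It is not the NodeManager's actual code: the function name tree_memory and the pid table are hypothetical, and the numbers are made up. The point is that memory is summed over the launch script and everything it spawns, so a native child process counts against the container just like any other process would.

```python
from collections import defaultdict


def tree_memory(processes, root_pid):
    """Sum (rss, vmem) over root_pid and all of its descendants.

    processes maps pid -> (parent_pid, rss_mb, vmem_mb). This table is a
    hypothetical stand-in for whatever the monitor reads from the OS.
    """
    # Index children by parent so the tree can be walked from the root.
    children = defaultdict(list)
    for pid, (ppid, _rss, _vmem) in processes.items():
        children[ppid].append(pid)

    total_rss = 0
    total_vmem = 0
    stack = [root_pid]
    while stack:
        pid = stack.pop()
        _ppid, rss, vmem = processes[pid]
        total_rss += rss
        total_vmem += vmem
        stack.extend(children[pid])
    return total_rss, total_vmem


# Hypothetical container: launch script (100) -> user script (101) -> C++ app (102).
# Pid 200 belongs to something else on the node and must not be counted.
processes = {
    100: (1, 10, 100),
    101: (100, 20, 200),
    102: (101, 3000, 3200),
    200: (1, 999, 999),
}

if __name__ == "__main__":
    rss_mb, vmem_mb = tree_memory(processes, 100)
    print(rss_mb, vmem_mb)
```

With this table the totals for the tree rooted at pid 100 come almost entirely from the C++ process (102), which is why the container as a whole would be over a 2 GB limit.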

-Varun

From: Kevin
Reply-To: "user@hadoop.apache.org"
Date: Tuesday, May 26, 2015 at 7:22 AM
To: "user@hadoop.apache.org"
Subject: Re: Using YARN with native applications

Thanks for the reply, Varun. So if I use the DefaultContainerExecutor and run a C++ application via a shell script inside a container whose virtual memory limit is, for example, 2 GB, and that application does a malloc for 3 GB, YARN will kill the container? I always just thought that YARN kept its eye on the JVM it spins up for the container (under the DefaultContainerExecutor).

-Kevin

On Mon, May 25, 2015 at 4:17 AM, Varun Vasudev <vv...@hortonworks.com> wrote:
Hi Kevin,

By default, the NodeManager monitors physical and virtual memory usage of containers. Containers that exceed either limit are killed. Admins can disable the checks by setting yarn.nodemanager.pmem-check-enabled and/or yarn.nodemanager.vmem-check-enabled to false. The virtual memory limit for a container is determined using the config variable yarn.nodemanager.vmem-pmem-ratio (default value is 2.1).

In case of vcores -

  1.  If you’re using Cgroups under LinuxContainerExecutor, by default, if there is spare CPU available on the node, your container will be allowed to use it. Admins can restrict containers to use only the CPU allocated to them by setting yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage to true. This setting is only applicable when using Cgroups under LinuxContainerExecutor.
  2.   If you aren’t using Cgroups under LinuxContainerExecutor, there is no limiting of the amount of the CPU that containers can use.

-Varun

From: Kevin
Reply-To: "user@hadoop.apache.org"
Date: Friday, May 22, 2015 at 3:30 AM
To: "user@hadoop.apache.org"
Subject: Using YARN with native applications

Hello,

I have been using the distributed shell application and Oozie to run native C++ applications in the cluster. Is YARN able to see the resources these native applications use? For example, if I use Oozie's shell action, the NodeManager hosts the mapper container and allocates a certain amount of memory and vcores (as configured). What happens if my C++ application uses more memory or vcores than the NodeManager allocated?

I was looking in the Hadoop code and I couldn't find my way to an answer. However, it seems the LinuxContainerExecutor may be the answer to my question, since it uses cgroups.

I'm interested to know how YARN reacts to non-Java applications running inside of it.

Thanks,
Kevin


Re: Using YARN with native applications

Posted by Kevin <ke...@gmail.com>.
Thanks for the reply, Varun. So if I use the DefaultContainerExecutor and
run a C++ application via a shell script inside a container whose virtual
memory limit is, for example, 2 GB, and that application does a malloc for
3 GB, YARN will kill the container? I always just thought that YARN kept
its eye on the JVM it spins up for the container (under the
DefaultContainerExecutor).

-Kevin

On Mon, May 25, 2015 at 4:17 AM, Varun Vasudev <vv...@hortonworks.com>
wrote:

>  Hi Kevin,
>
>  By default, the NodeManager monitors physical and virtual memory usage
> of containers. Containers that exceed either limit are killed. Admins can
> disable the checks by setting yarn.nodemanager.pmem-check-enabled and/or yarn.nodemanager.vmem-check-enabled
> to false. The virtual memory limit for a container is determined using the
> config variable yarn.nodemanager.vmem-pmem-ratio (default value is 2.1).
>
>  In case of vcores -
>
>    1. If you’re using Cgroups under LinuxContainerExecutor, by default,
>    if there is spare CPU available on the node, your container will be allowed
>    to use it. Admins can restrict containers to use only the CPU allocated to
>    them by setting yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage
>    to true. This setting is only applicable when using Cgroups under
>    LinuxContainerExecutor.
>    2.  If you aren’t using Cgroups under LinuxContainerExecutor, there is
>    no limiting of the amount of the CPU that containers can use.
>
>  -Varun
>
>   From: Kevin
> Reply-To: "user@hadoop.apache.org"
> Date: Friday, May 22, 2015 at 3:30 AM
> To: "user@hadoop.apache.org"
> Subject: Using YARN with native applications
>
>   Hello,
>
>  I have been using the distributed shell application and Oozie to run
> native C++ applications in the cluster. Is YARN able to see the resources
> these native applications use? For example, if I use Oozie's shell action,
> the NodeManager hosts the mapper container and allocates a certain amount
> of memory and vcores (as configured). What happens if my C++ application
> uses more memory or vcores than the NodeManager allocated?
>
>  I was looking in the Hadoop code and I couldn't find my way to an answer.
> However, it seems the LinuxContainerExecutor may be the answer to my
> question, since it uses cgroups.
>
>  I'm interested to know how YARN reacts to non-Java applications running
> inside of it.
>
>  Thanks,
> Kevin
>

Re: Using YARN with native applications

Posted by Kevin <ke...@gmail.com>.
Thanks for the reply, Varun. So if I use the DefaultContainerExecutor and
run a C++ application via a shell script inside a container whose virtual
memory limit is, for example, 2 GB, and that application does a malloc for
3 GB, YARN will kill the container? I always just thought that YARN kept
its eye on the JVM it spins up for the container (under the
DefaultContainerExecutor).

-Kevin

On Mon, May 25, 2015 at 4:17 AM, Varun Vasudev <vv...@hortonworks.com>
wrote:

>  Hi Kevin,
>
>  By default, the NodeManager monitors physical and virtual memory usage
> of containers. Containers that exceed either limit are killed. Admins can
> disable the checks by setting yarn.nodemanager.pmem-check-enabled and/or yarn.nodemanager.vmem-check-enabled
> to false. The virtual memory limit for a container is determined using the
> config variable yarn.nodemanager.vmem-pmem-ratio(default value is 2.1).
>
>  In case of vcores -
>
>    1. If you’re using Cgroups under LinuxContainerExecutor, by default,
>    if there is spare CPU available on the node, your container will be allowed
>    to use it. Admins can restrict containers to use only the CPU allocated to
>    them by setting yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage
>    to true. This setting is only applicable when using Cgroups under
>    LinuxContainerExecutor.
>    2.  If you aren’t using Cgroups under LinuxContainerExecutor, there is
>    no limiting of the amount of the CPU that containers can use.
>
>  -Varun
>
>   From: Kevin
> Reply-To: "user@hadoop.apache.org"
> Date: Friday, May 22, 2015 at 3:30 AM
> To: "user@hadoop.apache.org"
> Subject: Using YARN with native applications
>
>   Hello,
>
>  I have been using the distributed shell application and Oozie to run
> native C++ applications in the cluster. Is YARN able to see the resources
> these native applications use? For example, if I use Oozie's shell action,
> the NodeManager hosts the mapper container and allocates a certain amount
> of memory and vcores (as configured). What happens if my C++ application
> uses more memory or vcores than the NodeManager allocated?
>
>  I was looking in the Hadoop code and I couldn't find my way to answer.
> Although, it seems the LinuxContainerExecutor may be the answer to my
> question since it uses cgroups.
>
>  I'm interested to know how YARN reacts to non-Java applications running
> inside of it.
>
>  Thanks,
> Kevin
>

Re: Using YARN with native applications

Posted by Varun Vasudev <vv...@hortonworks.com>.
Hi Kevin,

By default, the NodeManager monitors the physical and virtual memory usage of containers. Containers that exceed either limit are killed. Admins can disable the checks by setting yarn.nodemanager.pmem-check-enabled and/or yarn.nodemanager.vmem-check-enabled to false. The virtual memory limit for a container is determined using the config variable yarn.nodemanager.vmem-pmem-ratio (default value is 2.1).
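
As a rough illustration of the memory rule above (a sketch only, not the NodeManager's actual code), the check can be modelled as follows. It assumes the virtual memory limit is the container's physical memory allocation multiplied by yarn.nodemanager.vmem-pmem-ratio. The function names and the MB units are hypothetical, chosen for the example, and the real monitor has details this sketch leaves out.

```python
def vmem_limit_mb(pmem_limit_mb: float, vmem_pmem_ratio: float = 2.1) -> float:
    """Virtual memory limit, assumed to be physical allocation times the ratio."""
    return pmem_limit_mb * vmem_pmem_ratio


def exceeds_limits(
    pmem_used_mb: float,
    vmem_used_mb: float,
    pmem_limit_mb: float,
    vmem_pmem_ratio: float = 2.1,
    pmem_check_enabled: bool = True,  # models yarn.nodemanager.pmem-check-enabled
    vmem_check_enabled: bool = True,  # models yarn.nodemanager.vmem-check-enabled
) -> bool:
    """Return True if the container would be over either enabled limit."""
    over_pmem = pmem_check_enabled and pmem_used_mb > pmem_limit_mb
    over_vmem = vmem_check_enabled and vmem_used_mb > vmem_limit_mb(
        pmem_limit_mb, vmem_pmem_ratio
    )
    return over_pmem or over_vmem


if __name__ == "__main__":
    # A container allocated 1024 MB gets a virtual limit of about 2150 MB.
    print(vmem_limit_mb(1024))
    # A process that has reserved 3072 MB of virtual memory is over that
    # limit, even if it has touched only 100 MB of physical memory.
    print(exceeds_limits(pmem_used_mb=100, vmem_used_mb=3072, pmem_limit_mb=1024))
```

Under these assumptions, a native process that reserves more virtual memory than the ratio allows would trip the virtual memory check regardless of the language it is written in, unless that check is disabled.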

In case of vcores -

  1.  If you’re using Cgroups under LinuxContainerExecutor, by default, if there is spare CPU available on the node, your container will be allowed to use it. Admins can restrict containers to use only the CPU allocated to them by setting yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage to true. This setting is only applicable when using Cgroups under LinuxContainerExecutor.
  2.  If you aren’t using Cgroups under LinuxContainerExecutor, there is no limit on the amount of CPU that containers can use.
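
For reference, the settings involved in case 1 can be written out as a yarn-site.xml fragment. The sketch below only renders that fragment with Python's standard library. Only the strict-resource-usage property is named in this thread; the executor class and resources-handler entries are assumptions taken from the Hadoop 2.x documentation on using Cgroups with YARN and should be checked against your version. The helper name render_yarn_site_fragment is made up for the example.

```python
import xml.etree.ElementTree as ET

# Property names and values. Only the last entry appears in this thread;
# the first two are assumptions based on the Hadoop 2.x Cgroups documentation.
CGROUPS_CPU_PROPERTIES = {
    "yarn.nodemanager.container-executor.class":
        "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor",
    "yarn.nodemanager.linux-container-executor.resources-handler.class":
        "org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler",
    # true: containers are held to their allocated CPU share.
    # false (the default): containers may use spare CPU on the node.
    "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage":
        "true",
}


def render_yarn_site_fragment(properties: dict) -> str:
    """Render name/value pairs as a <configuration> block for yarn-site.xml."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_yarn_site_fragment(CGROUPS_CPU_PROPERTIES))
```

Leaving strict-resource-usage at false corresponds to the default behaviour described in case 1, where a container can use spare CPU on the node.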

-Varun

From: Kevin
Reply-To: "user@hadoop.apache.org"
Date: Friday, May 22, 2015 at 3:30 AM
To: "user@hadoop.apache.org"
Subject: Using YARN with native applications

Hello,

I have been using the distributed shell application and Oozie to run native C++ applications in the cluster. Is YARN able to see the resources these native applications use? For example, if I use Oozie's shell action, the NodeManager hosts the mapper container and allocates a certain amount of memory and vcores (as configured). What happens if my C++ application uses more memory or vcores than the NodeManager allocated?

I was looking in the Hadoop code and I couldn't find my way to answer. Although, it seems the LinuxContainerExecutor may be the answer to my question since it uses cgroups.

I'm interested to know how YARN reacts to non-Java applications running inside of it.

Thanks,
Kevin
