Posted to user@hadoop.apache.org by Varun Vasudev <vv...@apache.org> on 2016/05/01 10:46:14 UTC

Re: YarnChild and Container running beyond physical memory limits

Hi Joseph,

YarnChild is a wrapper around the MR task process that actually carries out the work on the machine. From YarnChild.java -
/**
 * The main() for MapReduce task processes.
 */

In the snippets you provided, YARN's memory monitor killed the map tasks because they exceeded their allocated memory -
Container [pid=30518,containerID=container_1460573911020_0002_01_000033] is running beyond physical memory limits. Current usage: 6.6 GB of 2.9 GB physical memory used; 17.6 GB of 11.7 GB virtual memory used. Killing container.
And
Container [pid=10124,containerID=container_1460478789757_0001_01_000020] is running beyond physical memory limits. Current usage: 5.4 GB of 5 GB physical memory used; 8.4 GB of 20 GB virtual memory used. Killing container.
> and it's always due to some other unrelated external process chewing up RAM.
This should not be the case. The way YARN determines memory usage is by walking down the process tree of the container. We don’t look at memory being used by external processes.
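To make that concrete, here is a rough sketch of the idea (this is not the actual ContainersMonitorImpl/ProcfsBasedProcessTree code; it assumes Linux /proc and a 4 KB page size): start from the container's root pid, find every descendant process, and sum their resident set sizes.

import java.io.IOException;
import java.nio.file.*;
import java.util.*;

// Simplified sketch of per-container memory accounting on Linux:
// collect a pid -> ppid map from /proc, then sum the RSS of every
// process whose ancestry leads back to the container's root pid.
public class ProcessTreeMemory {

    public static long treeRssBytes(int rootPid) throws IOException {
        final long pageSize = 4096L; // assumption; the real monitor queries the OS
        Map<Integer, Integer> parent = new HashMap<>();
        try (DirectoryStream<Path> procs =
                 Files.newDirectoryStream(Paths.get("/proc"), "[0-9]*")) {
            for (Path p : procs) {
                int pid = Integer.parseInt(p.getFileName().toString());
                try {
                    // ppid is the field right after "(comm)" and the state in /proc/<pid>/stat
                    String stat = new String(Files.readAllBytes(p.resolve("stat")));
                    String afterComm = stat.substring(stat.lastIndexOf(')') + 2);
                    parent.put(pid, Integer.parseInt(afterComm.split(" ")[1]));
                } catch (IOException | NumberFormatException e) {
                    // process exited while we were scanning; skip it
                }
            }
        }
        long total = 0;
        for (int pid : parent.keySet()) {
            int cur = pid;
            while (cur > 1 && cur != rootPid) {            // walk up the ancestry
                Integer up = parent.get(cur);
                if (up == null) { cur = 1; break; }
                cur = up;
            }
            if (cur != rootPid) continue;                  // not in this container's tree
            try {
                // resident pages = 2nd field of /proc/<pid>/statm
                String statm = new String(Files.readAllBytes(Paths.get("/proc/" + pid + "/statm")));
                total += Long.parseLong(statm.trim().split("\\s+")[1]) * pageSize;
            } catch (IOException e) {
                // process exited; skip it
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        int rootPid = Integer.parseInt(args[0]); // e.g. pid 30518 from your log
        System.out.printf("process tree RSS: %.1f GB%n",
                treeRssBytes(rootPid) / (1024.0 * 1024 * 1024));
    }
}

The figure in the kill message is this sum over the container's own tree, which is why a process outside that tree should not be able to push your container over its limit.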
I would recommend increasing the amount of memory allocated to your map tasks until the job finishes (to figure out the upper limit of your map tasks), and going through your map code to see where memory usage could spike.
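For example, the allocation and the task heap can be raised together in the job driver (just a sketch - the values below are placeholders to tune for your job, and the heap is kept below the container size so non-heap memory still fits):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class Driver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Container size YARN allocates to each map task, in MB (placeholder value).
        conf.set("mapreduce.map.memory.mb", "6144");
        // Map task JVM heap; keep it roughly 75-80% of the container size so
        // non-heap memory (thread stacks, direct buffers, metaspace) still fits.
        conf.set("mapreduce.map.java.opts", "-Xmx4915m");

        Job job = Job.getInstance(conf, "my-job");
        // ... mapper, reducer, input/output paths as usual ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The same two keys can also be set in mapred-site.xml.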
-Varun

From:  Joseph Naegele <jn...@grierforensics.com>
Date:  Thursday, April 14, 2016 at 5:10 AM
To:  <us...@hadoop.apache.org>
Subject:  YarnChild and Container running beyond physical memory limits

Hi!

Can anyone tell me what exactly YarnChild is and how I can control the number of child JVMs running in each container? In this case I'm concerned with the map phase of my MR job. I'm having issues with my containers running beyond *physical* memory limits and I'm trying to determine the cause.

Is each child JVM just an individual map task? If so, why do I see a variable number of them? I don't know whether each of these JVMs is a clone of the original YarnChild process, what they are doing, or why each of them is using so much memory (~1 GB).

Here is a sample excerpt of my MR job when YARN kills a container: https://gist.githubusercontent.com/naegelejd/ad3a58192a2df79775d80e3eac0ae49c/raw/808f998b1987c77ba1fe7fb41abab62ae07c5e02/job.log

Here's the same process tree reorganized and ordered by ancestry: https://gist.githubusercontent.com/naegelejd/37afb27a6cf16ce918daeaeaf7450cdc/raw/b8809ce023840799f2cbbee28e49930671198ead/job.clean.log

If I increase the amount of memory per container, in turn lowering the total number of containers, I see these errors less often, as expected, BUT when I do see them, there are NO child JVM processes and it's always due to some other unrelated external process chewing up RAM. Here is an example of that: https://gist.githubusercontent.com/naegelejd/32d63b0f9b9c148d1c1c7c0de3c2c317/raw/934a93a7afe09c7cd62a50edc08ce902b9e71aac/job.log. You can see that the [redacted] process is the culprit in that case.

I can share my mapred/yarn configuration if it's helpful.

If anyone has any ideas I'd greatly appreciate them!

Thanks,

Joe