Posted to yarn-dev@hadoop.apache.org by Jimson James <ma...@gmail.com> on 2013/01/03 02:58:44 UTC

Help: AM Container is running beyond virtual memory limits

Hi all,

I was playing with the distributed shell application (hadoop-2.0.0-cdh4.1.2).
This is the error I'm receiving at the moment:

13/01/01 17:09:09 INFO distributedshell.Client: Got application report
from ASM for, appId=5, clientToken=null,

appDiagnostics=Application application_1357039792045_0005 failed 1
times due to AM Container for appattempt_1357039792045_0005_000001
exited with  exitCode: 143 due to:

Container [pid=24845,containerID=container_1357039792045_0005_01_000001]
is running beyond virtual memory limits.

Current usage: 77.8mb of 512.0mb physical memory used; 1.1gb of 1.0gb
virtual memory used.

Killing container.
Dump of the process-tree for container_1357039792045_0005_01_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES)
FULL_CMD_LINE
|- 24849 24845 24845 24845 (java) 165 12 1048494080 19590
/usr/java/bin/java -Xmx512m
org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster
--container_memory 128 --num_containers 1 --priority 0 --shell_command
ping --shell_args localhost --debug
|- 24845 23394 24845 24845 (bash) 0 0 108654592 315 /bin/bash -c
/usr/java/bin/java -Xmx512m
org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster
--container_memory 128 --num_containers 1 --priority 0 --shell_command
ping --shell_args localhost --debug
1>/tmp/logs/application_1357039792045_0005/container_1357039792045_0005_01_000001/AppMaster.stdout
2>/tmp/logs/application_1357039792045_0005/container_1357039792045_0005_01_000001/AppMaster.stderr
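
As a sanity check on the dump above, here is a minimal sketch that adds up the
per-process figures and compares them with the "Current usage" line. The
numbers are copied from the dump; the 4 KiB page size is an assumption (the log
does not state it), and the helper names are made up for illustration.

```python
# (pid, VMEM_USAGE in bytes, RSSMEM_USAGE in pages), copied from the dump.
PROCESS_TREE = [
    (24849, 1048494080, 19590),  # java ApplicationMaster
    (24845, 108654592, 315),     # bash wrapper that launched it
]

PAGE_SIZE = 4096  # assumption: 4 KiB pages; not stated in the log


def total_vmem_bytes(tree):
    """Sum the virtual memory of every process in the tree."""
    return sum(vmem for _pid, vmem, _rss in tree)


def total_rss_bytes(tree, page_size=PAGE_SIZE):
    """Sum the resident pages of every process and convert to bytes."""
    return sum(rss for _pid, _vmem, rss in tree) * page_size


if __name__ == "__main__":
    vmem_gb = total_vmem_bytes(PROCESS_TREE) / 2**30
    rss_mb = total_rss_bytes(PROCESS_TREE) / 2**20
    print(f"virtual:  {vmem_gb:.1f} GB")
    print(f"physical: {rss_mb:.1f} MB")
```

Under those assumptions the totals round to the 1.1gb virtual and 77.8mb
physical figures in the report, so the java process alone accounts for nearly
all of the virtual memory.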

The interesting part is that there seems to be no problem with the setup,
since a simple ls or uname command completed successfully and the output
was available in the container2 stdout.

Regarding the setup, yarn.nodemanager.vmem-pmem-ratio is 3 and the total
physical memory available is 2GB+, which I think is more than enough for
the example to run.
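
One thing that can be checked from the numbers alone: if the virtual memory
limit is the container's physical allocation multiplied by the vmem-pmem ratio
(an assumption on my part, the log does not say how the limit is derived), then
a ratio of 3 does not match the limit in the report. A minimal sketch, with
made-up helper names:

```python
def vmem_limit_mb(pmem_mb, ratio):
    """Virtual memory limit, assuming limit = physical allocation * ratio."""
    return pmem_mb * ratio


def implied_ratio(reported_vmem_limit_mb, pmem_mb):
    """Ratio that would produce the reported limit under the same assumption."""
    return reported_vmem_limit_mb / pmem_mb


if __name__ == "__main__":
    pmem_mb = 512.0            # "512.0mb physical memory" from the report
    reported_limit_mb = 1024.0  # "1.0gb virtual memory" from the report

    print("limit with ratio 3:", vmem_limit_mb(pmem_mb, 3), "MB")
    print("ratio implied by report:", implied_ratio(reported_limit_mb, pmem_mb))
```

With a ratio of 3 the limit would be 1536 MB, while the reported 1.0gb limit
corresponds to a ratio of about 2 (the report is rounded to one decimal, so the
exact value is uncertain). That mismatch may mean the configured value is not
being picked up by the NodeManager.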

For the command in question, "ping localhost" generated two replies, as
can be seen from
containerlogs/container_1357039792045_0005_01_000002/721917/stdout/?start=-4096.

So, what could be the problem?