Posted to common-user@hadoop.apache.org by Krishna Kishore Bonagiri <wr...@gmail.com> on 2013/03/08 08:42:00 UTC
Application Master getting killed randomly reporting excess usage of memory
Hi,
I am running an application on YARN 500 times in a loop. It ran correctly
321 times, but on the 322nd run the AM container was killed for exceeding
its memory limit. I doubt that it really exceeded the limit, because the
same application ran fine 321 times, and I never saw this kind of error in
my previous runs of such loops. Could this error be reported for some
other reason? I am using the hadoop-2.0.0-alpha version. Please help.
2013-03-07 10:55:35,853 INFO Client (Client.java:main(143)) - Initializing
Client
2013-03-07 10:55:35,867 INFO Client (Client.java:launchAndMonitorAM(463))
- Starting Client
2013-03-07 10:55:35,957 INFO Client (Client.java:connectToASM(564)) -
Connecting to ResourceManager at isredeng/127.0.1.1:8032
2013-03-07 10:55:36,540 INFO Client (Client.java:dumpClusterInfo(246)) -
Got Cluster metric info from ASM, numNodeManagers=1
2013-03-07 10:55:36,561 INFO Client (Client.java:dumpClusterInfo(251)) -
Got Cluster node info from ASM
2013-03-07 10:55:36,738 INFO Client (Client.java:dumpClusterInfo(253)) -
Got node report from ASM for, nodeId=isredeng:33967,
nodeAddress=isredeng:8042, nodeRackName=/default-rack,
nodeNumContainers=14, nodeHealthStatus=is_node_healthy: true,
health_report: "", last_health_report_time: 1362671618339,
2013-03-07 10:55:36,746 INFO Client (Client.java:dumpClusterInfo(263)) -
Queue info, queueName=default, queueCurrentCapacity=0.21875,
queueMaxCapacity=1.0, queueApplicationCount=0, queueChildQueueCount=0
2013-03-07 10:55:36,755 INFO Client (Client.java:dumpClusterInfo(275)) -
User ACL Info for Queue, queueName=default, userAcl=SUBMIT_APPLICATIONS
2013-03-07 10:55:36,755 INFO Client (Client.java:dumpClusterInfo(275)) -
User ACL Info for Queue, queueName=default, userAcl=ADMINISTER_QUEUE
2013-03-07 10:55:36,763 INFO Client (Client.java:getApplication(577)) -
Got new application id=application_1362668734615_0322
2013-03-07 10:55:36,763 INFO Client (Client.java:launchAndMonitorAM(476))
- Min mem capabililty of resources in this cluster 128
2013-03-07 10:55:36,764 INFO Client (Client.java:launchAndMonitorAM(477))
- Max mem capabililty of resources in this cluster 10240
2013-03-07 10:55:36,764 INFO Client (Client.java:launchAndMonitorAM(484))
- Setting up application submission context for ASM
2013-03-07 10:55:37,117 INFO Client (Client.java:prepareJarResource(288))
- Copy App Master jar from local filesystem and add to local environment
2013-03-07 10:55:37,390 INFO Client (Client.java:launchAndMonitorAM(519))
- Set the environment for the application master
2013-03-07 10:55:37,391 INFO Client
(Client.java:getTestRuntimeClasspath(592)) - Trying to generate classpath
for app master from current thread's classpath
2013-03-07 10:55:37,392 INFO Client
(Client.java:getTestRuntimeClasspath(604)) - Readable bytes from stream :
8559
2013-03-07 10:55:37,394 INFO Client (Client.java:prepareCommand(346)) -
Setting up app master command
2013-03-07 10:55:37,395 INFO Client (Client.java:prepareCommand(364)) -
Completed setting up app master command ${JAVA_HOME}/bin/java
ApplicationMaster --osh_am_port 10011 --osh_env
LD_LIBRARY_PATH=/home_/dsadm/kishore/yarn_feb14/orch_master/apt/lib::/home_/dsadm/kishore/yarn_feb14/orch_master/apt/lib:
--osh_env APT_ORCHHOME=/home_/dsadm/kishore/yarn_feb14/orch_master/apt
1><LOG_DIR>/AppMaster.stdout 2><LOG_DIR>/AppMaster.stderr
2013-03-07 10:55:37,397 INFO Client
(Client.java:submitAndMonitorApplication(385)) - Submitting application to
ASM
2013-03-07 10:55:38,458 INFO Client (Client.java:monitorApplication(413))
- Got application report from ASM for, appId=322, appDiagnostics=,
appMasterHost=N/A, clientToken=null, appQueue=default, appMasterRpcPort=0,
appStartTime=1362671737443, yarnAppState=SUBMITTED,
distributedFinalState=UNDEFINED, appTrackingUrl=
isredeng.swg.usma.ibm.com:8088/proxy/application_1362668734615_0322/,
appUser=dsadm
2013-03-07 10:55:39,460 INFO Client (Client.java:monitorApplication(413))
- Got application report from ASM for, appId=322, appDiagnostics=,
appMasterHost=N/A, clientToken=null, appQueue=default, appMasterRpcPort=0,
appStartTime=1362671737443, yarnAppState=SUBMITTED,
distributedFinalState=UNDEFINED, appTrackingUrl=
isredeng.swg.usma.ibm.com:8088/proxy/application_1362668734615_0322/,
appUser=dsadm
2013-03-07 10:55:40,463 INFO Client (Client.java:monitorApplication(413))
- Got application report from ASM for, appId=322, appDiagnostics=,
appMasterHost=N/A, clientToken=null, appQueue=default, appMasterRpcPort=0,
appStartTime=1362671737443, yarnAppState=SUBMITTED,
distributedFinalState=UNDEFINED, appTrackingUrl=
isredeng.swg.usma.ibm.com:8088/proxy/application_1362668734615_0322/,
appUser=dsadm
2013-03-07 10:55:41,467 INFO Client (Client.java:monitorApplication(413))
- Got application report from ASM for, appId=322,
appDiagnostics=Application application_1362668734615_0322 failed 1 times
due to AM Container for appattempt_1362668734615_0322_000001 exited with
exitCode: 143 due to: Container
[pid=3606,containerID=container_1362668734615_0322_01_000001] is running
beyond virtual memory limits. Current usage: 37.0mb of 128.0mb physical
memory used; 998.4mb of 268.8mb virtual memory used. Killing container.
Dump of the process-tree for container_1362668734615_0322_01_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 3612 3606 3606 3606 (java) 150 13 938192896 9164
/home/kbonagir/yarn/jdk//bin/java ApplicationMaster --osh_am_port 10011
--osh_env
LD_LIBRARY_PATH=/home_/dsadm/kishore/yarn_feb14/orch_master/apt/lib::/home_/dsadm/kishore/yarn_feb14/orch_master/apt/lib:
--osh_env APT_ORCHHOME=/home_/dsadm/kishore/yarn_feb14/orch_master/apt
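As a rough cross-check of the numbers in the diagnostic above, here is a small sketch of the arithmetic. It assumes the NodeManager derives the virtual memory limit by multiplying the container's physical memory limit by a ratio of 2.1, which I believe is the default of yarn.nodemanager.vmem-pmem-ratio. The log does not state this ratio, so treat it as an assumption. All other values are copied from the message, and the variable names are only illustrative.

```python
MB = 1024 * 1024

# Values copied from the diagnostic message above.
physical_limit_mb = 128.0        # "128.0mb physical memory"
reported_vmem_limit_mb = 268.8   # "268.8mb virtual memory"
java_vmem_bytes = 938192896      # VMEM_USAGE(BYTES) of the java process, pid 3612

# Assumption (not stated in the log): virtual limit = physical limit * ratio,
# with ratio = 2.1, believed to be the default yarn.nodemanager.vmem-pmem-ratio.
assumed_vmem_pmem_ratio = 2.1

derived_vmem_limit_mb = physical_limit_mb * assumed_vmem_pmem_ratio
java_vmem_mb = java_vmem_bytes / MB

print(f"derived vmem limit: {derived_vmem_limit_mb:.1f} MB")
print(f"java process vmem:  {java_vmem_mb:.1f} MB")
print(f"java alone over the limit: {java_vmem_mb > derived_vmem_limit_mb}")
```

Under that assumption the derived limit matches the 268.8mb in the message, and the java process alone (roughly 895 MB of virtual address space) is already over it, even though only 37.0mb of physical memory is in use. This checks the arithmetic only; it does not explain why just the 322nd run was flagged.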
Thanks,
Kishore