Posted to yarn-issues@hadoop.apache.org by "Szilard Nemeth (Jira)" <ji...@apache.org> on 2021/01/08 13:40:00 UTC

[jira] [Updated] (YARN-10264) Add container launch related env / classpath debug info to container logs when a container fails

     [ https://issues.apache.org/jira/browse/YARN-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-10264:
----------------------------------
        Parent: YARN-10323
    Issue Type: Sub-task  (was: Task)

> Add container launch related env / classpath debug info to container logs when a container fails
> ------------------------------------------------------------------------------------------------
>
>                 Key: YARN-10264
>                 URL: https://issues.apache.org/jira/browse/YARN-10264
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Szilard Nemeth
>            Assignee: Szilard Nemeth
>            Priority: Major
>
> When a container fails to launch, it can be hard to figure out why.
> Similar to YARN-4309, we can add a switch that controls whether the environment variables and the Java classpath are printed.
> As a bonus, [jdeps|https://docs.oracle.com/javase/8/docs/technotes/tools/unix/jdeps.html] could also be used to print verbose information about the classpath.
> When log aggregation occurs, all of this information is collected automatically, which makes debugging such container launch failures much easier.
> Below is an example of the output a user sees when an application fails to launch because of a classpath configuration issue:
> {code:java}
> End of LogType:prelaunch.err
> ******************************************************************************
> 2020-04-19 05:49:12,145 DEBUG:app_info:Diagnostics of the failed app
> 2020-04-19 05:49:12,145 DEBUG:app_info:Application application_1587300264561_0001 failed 2 times due to AM Container for appattempt_1587300264561_0001_000002 exited with  exitCode: 1
> Failing this attempt.Diagnostics: [2020-04-19 12:45:01.955]Exception from container-launch.
> Container id: container_e60_1587300264561_0001_02_000001
> Exit code: 1
> Exception message: Launch container failed
> Shell output: main : command provided 1
> main : run as user is systest
> main : requested yarn user is systest
> Getting exit code file...
> Creating script paths...
> Writing pid file...
> Writing to tmp file /dataroot/ycloud/yarn/nm/nmPrivate/application_1587300264561_0001/container_e60_1587300264561_0001_02_000001/container_e60_1587300264561_0001_02_000001.pid.tmp
> Writing to cgroup task files...
> Creating local dirs...
> Launching container...
> Getting exit code file...
> Creating script paths...
> [2020-04-19 12:45:01.984]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
> Last 4096 bytes of prelaunch.err :
> Last 4096 bytes of stderr :
> Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster
> Please check whether your etc/hadoop/mapred-site.xml contains the below configuration:
> <property>
>   <name>yarn.app.mapreduce.am.env</name>
>   <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
> </property>
> <property>
>   <name>mapreduce.map.env</name>
>   <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
> </property>
> <property>
>   <name>mapreduce.reduce.env</name>
>   <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
> </property>
> [2020-04-19 12:45:01.985]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
> Last 4096 bytes of prelaunch.err :
> Last 4096 bytes of stderr :
> Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster
> Please check whether your etc/hadoop/mapred-site.xml contains the below configuration:
> <property>
>   <name>yarn.app.mapreduce.am.env</name>
>   <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
> </property>
> <property>
>   <name>mapreduce.map.env</name>
>   <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
> </property>
> <property>
>   <name>mapreduce.reduce.env</name>
>   <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
> </property>
> For more detailed output, check the application tracking page: http://quasar-plnefj-2.quasar-plnefj.root.hwx.site:8088/cluster/app/application_1587300264561_0001 Then click on links to logs of each attempt.
> ...
> 2020-04-19 05:49:12,148 INFO:util:* End test_app_API (yarn.suite.YarnAPITests) *
> {code}
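
For illustration only, the kind of debug information described above (environment variables plus the Java classpath) could be collected roughly as follows. This is a minimal sketch under stated assumptions, not the change proposed in this issue: the class name LaunchDebugInfo and its methods are hypothetical, and it relies only on standard JDK calls (System.getenv and the java.class.path system property). The "(missing)" marker is an extra assumption added here to make a classpath misconfiguration easier to spot.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Hadoop YARN. */
public class LaunchDebugInfo {

    /** Splits a classpath string into its entries, dropping empty ones. */
    public static List<String> splitClasspath(String classpath) {
        List<String> entries = new ArrayList<>();
        if (classpath == null) {
            return entries;
        }
        // File.pathSeparator is ":" on Unix and ";" on Windows.
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    /** Renders the environment (sorted by name) and the classpath entries as text. */
    public static String render(Map<String, String> env, String classpath) {
        StringBuilder sb = new StringBuilder();
        sb.append("Environment variables:\n");
        for (Map.Entry<String, String> e : new TreeMap<>(env).entrySet()) {
            sb.append("  ").append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        sb.append("Classpath entries:\n");
        for (String entry : splitClasspath(classpath)) {
            // Wildcard entries such as lib/* are not checked on disk.
            boolean missing = !entry.endsWith("*") && !new File(entry).exists();
            sb.append("  ").append(entry).append(missing ? "  (missing)" : "").append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Written to stderr so that it would land next to the launch error output.
        System.err.print(render(System.getenv(), System.getProperty("java.class.path")));
    }
}
```

Because environment variables can hold credentials, output like this is best kept behind a switch that is off by default, which matches the switch suggested in the description.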


