Posted to yarn-dev@hadoop.apache.org by "Yesha Vora (JIRA)" <ji...@apache.org> on 2017/08/21 21:17:00 UTC
[jira] [Created] (YARN-7065) [RM UI] App status not getting updated in "All application" page
Yesha Vora created YARN-7065:
--------------------------------
Summary: [RM UI] App status not getting updated in "All application" page
Key: YARN-7065
URL: https://issues.apache.org/jira/browse/YARN-7065
Project: Hadoop YARN
Issue Type: Bug
Reporter: Yesha Vora
Scenario:
1) Run a long-running Spark application
2) Trigger RM and NN failovers at random
3) Validate the application state in YARN
The Spark applications have finished. The YARN CLI returns the correct status for the application:
{code}
[hrt_qa@xxx hadoopqe]$ yarn application -status application_1503203977699_0014
17/08/21 16:56:10 INFO client.AHSProxy: Connecting to Application History server at host1 xxx.xx.xx.x:10200
17/08/21 16:56:10 INFO client.RequestHedgingRMFailoverProxyProvider: Looking for the active RM in [rm1, rm2]...
17/08/21 16:56:10 INFO client.RequestHedgingRMFailoverProxyProvider: Found active RM [rm1]
Application Report :
Application-Id : application_1503203977699_0014
Application-Name : org.apache.spark.sql.execution.datasources.hbase.examples.LRJobForDataSources
Application-Type : SPARK
User : hrt_qa
Queue : default
Application Priority : null
Start-Time : 1503215983532
Finish-Time : 1503250203806
Progress : 0%
State : FAILED
Final-State : FAILED
Tracking-URL : https://host1:8090/cluster/app/application_1503203977699_0014
RPC Port : -1
AM Host : N/A
Aggregate Resource Allocation : 174722793 MB-seconds, 170603 vcore-seconds
Log Aggregation Status : SUCCEEDED
Diagnostics : Application application_1503203977699_0014 failed 20 times due to AM Container for appattempt_1503203977699_0014_000020 exited with exitCode: 1
For more detailed output, check the application tracking page: https://host1:8090/cluster/app/application_1503203977699_0014 Then click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e04_1503203977699_0014_20_000001
Exit code: 1
Stack trace: org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException: Launch container failed
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.launchContainer(DefaultLinuxContainerRuntime.java:109)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.launchContainer(DelegatingLinuxContainerRuntime.java:89)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:392)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:317)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Shell output: main : command provided 1
main : run as user is hrt_qa
main : requested yarn user is hrt_qa
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file /grid/0/hadoop/yarn/local/nmPrivate/application_1503203977699_0014/container_e04_1503203977699_0014_20_000001/container_e04_1503203977699_0014_20_000001.pid.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...
Getting exit code file...
Creating script paths...
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
Unmanaged Application : false
Application Node Label Expression : <Not set>
AM container Node Label Expression : <DEFAULT_PARTITION>{code}
However, the RM UI "All application" page still shows the application in the "RUNNING" state.
https://host1:8090/cluster
Clicking the application ID (https://host1:8090/cluster/app/application_1503203977699_0014) opens the application page, which shows the correct application state: FAILED.
The application status is not being updated on the YARN "All application" page.
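
To help narrow down where the stale state comes from, the check in step 3 could be scripted against the RM REST API. The following is only a minimal sketch: it assumes the RM exposes `/ws/v1/cluster/apps` and `/ws/v1/cluster/apps/{appid}` on the same host and port as the UI, and that the list endpoint is a reasonable stand-in for what the "All application" page displays, which this report does not establish. The helper names (`fetch_json`, `state_from_app_list`, `state_from_app_detail`, `find_state_mismatch`) are hypothetical and used only for illustration.

```python
import json
import urllib.request
from typing import Optional, Tuple


def fetch_json(url: str) -> dict:
    """Fetch a URL and parse the body as JSON (assumes an unauthenticated RM)."""
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def state_from_app_list(apps_payload: dict, app_id: str) -> Optional[str]:
    """Return the state of app_id in a /ws/v1/cluster/apps payload, or None."""
    # The RM returns {"apps": null} when there are no applications.
    apps = (apps_payload.get("apps") or {}).get("app") or []
    for app in apps:
        if app.get("id") == app_id:
            return app.get("state")
    return None


def state_from_app_detail(app_payload: dict) -> Optional[str]:
    """Return the state in a /ws/v1/cluster/apps/{appid} payload, or None."""
    return (app_payload.get("app") or {}).get("state")


def find_state_mismatch(
    apps_payload: dict, app_payload: dict, app_id: str
) -> Optional[Tuple[Optional[str], Optional[str]]]:
    """Return (list_state, detail_state) if the two views disagree, else None."""
    list_state = state_from_app_list(apps_payload, app_id)
    detail_state = state_from_app_detail(app_payload)
    if list_state != detail_state:
        return (list_state, detail_state)
    return None


if __name__ == "__main__":
    # Assumption: base URL and application ID taken from the report above.
    base = "https://host1:8090/ws/v1/cluster/apps"
    app_id = "application_1503203977699_0014"
    mismatch = find_state_mismatch(
        fetch_json(base), fetch_json(base + "/" + app_id), app_id
    )
    print("mismatch:", mismatch)
```

If the two endpoints agree while the UI page still shows RUNNING, that would point at the page rendering rather than the RM's stored application state; if they disagree, the list view itself is stale.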