You are viewing a plain text version of this content. The canonical link for it is here.
Posted to yarn-issues@hadoop.apache.org by "Naganarasimha G R (JIRA)" <ji...@apache.org> on 2015/02/03 11:55:34 UTC

[jira] [Commented] (YARN-3127) Apphistory url crashes when RM switches with ATS enabled

    [ https://issues.apache.org/jira/browse/YARN-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303071#comment-14303071 ] 

Naganarasimha G R commented on YARN-3127:
-----------------------------------------

In {{org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.AttemptRecoveredTransition}}
{quote}
if (rmApp.getCurrentAppAttempt() == appAttempt
            && !RMAppImpl.isAppInFinalState(rmApp)) {
          // Add the previous finished attempt to scheduler synchronously so
          // that scheduler knows the previous attempt.
          appAttempt.scheduler.handle(new AppAttemptAddedSchedulerEvent(
            appAttempt.getAppAttemptId(), false, true));
          (new BaseFinalTransition(appAttempt.recoveredFinalState)).transition(
              appAttempt, event);
        }
{quote}
RMAppImpl.isAppInFinalState returns true hence the transition which publishes the attempt during recovery to ATS is not played. 
So one option is to move BaseFinalTransition.transition  outside this if block.
But other query which i have is, During recovery whether is it required to publish events to ATS ?



> Apphistory url crashes when RM switches with ATS enabled
> --------------------------------------------------------
>
>                 Key: YARN-3127
>                 URL: https://issues.apache.org/jira/browse/YARN-3127
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager, timelineserver
>    Affects Versions: 2.6.0
>         Environment: RM HA with ATS
>            Reporter: Bibin A Chundatt
>            Assignee: Naganarasimha G R
>
> 1.Start RM with HA and ATS configured and run some yarn applications
> 2.Once applications are finished sucessfully start timeline server
> 3.Now failover HA form active to standby
> 4.Access timeline server URL <IP>:<PORT>/applicationhistory
> Result: Application history URL fails with below info
> {quote}
> 2015-02-03 20:28:09,511 ERROR org.apache.hadoop.yarn.webapp.View: Failed to read the applications.
> java.lang.reflect.UndeclaredThrowableException
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1643)
> 	at org.apache.hadoop.yarn.server.webapp.AppsBlock.render(AppsBlock.java:80)
> 	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:67)
> 	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77)
> 	at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
> 	at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
> 	...
> Caused by: org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: The entity for application attempt appattempt_1422972608379_0001_000001 doesn't exist in the timeline store
> 	at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getApplicationAttempt(ApplicationHistoryManagerOnTimelineStore.java:151)
> 	at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.generateApplicationReport(ApplicationHistoryManagerOnTimelineStore.java:499)
> 	at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getAllApplications(ApplicationHistoryManagerOnTimelineStore.java:108)
> 	at org.apache.hadoop.yarn.server.webapp.AppsBlock$1.run(AppsBlock.java:84)
> 	at org.apache.hadoop.yarn.server.webapp.AppsBlock$1.run(AppsBlock.java:81)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> 	... 51 more
> 2015-02-03 20:28:09,512 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /applicationhistory
> org.apache.hadoop.yarn.webapp.WebAppException: Error rendering block: nestLevel=6 expected 5
> 	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
> 	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77)
> {quote}
> Behaviour with AHS with file based history store
> 	-Apphistory url is working 
> 	-No attempt entries are shown for each application.
> 	
> Based on inital analysis when RM switches ,application attempts from state store  are not replayed but only applications are.
> So when /applicaitonhistory url is accessed it tries for all attempt id and fails



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)