Posted to yarn-issues@hadoop.apache.org by "Zhijie Shen (JIRA)" <ji...@apache.org> on 2014/07/22 09:41:39 UTC

[jira] [Commented] (YARN-2330) Jobs are not displaying in timeline server after RM restart

    [ https://issues.apache.org/jira/browse/YARN-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069943#comment-14069943 ] 

Zhijie Shen commented on YARN-2330:
-----------------------------------

It is very likely that the history file of the given application has not yet been closed for writing (for example, after the RM restarts, it reopens the history file to append history information). Meanwhile, the reader tries to scan a file that is still being written.

The following logic is broken, because the writer is invoked on the RM, while the reader is invoked on the timeline server. Hence, from the reader's point of view, outstandingWriters is always empty, and it cannot be used to indicate whether a file is open for writing:
{code}
    // The history file is still under writing
    if (outstandingWriters.containsKey(appId)) {
      throw new IOException("History file for application " + appId
          + " is under writing");
    }
{code}
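
To illustrate the point, here is a minimal sketch of why an in-memory map cannot signal write state across processes. It is not the actual FileSystemApplicationHistoryStore code: the class HistoryStoreSketch and its methods are hypothetical, and the two instances only stand in for the separate JVMs of the RM and the timeline server.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for one process's view of the history store.
// Each instance owns its own outstandingWriters map, just as the RM and
// the timeline server each hold a separate in-memory map.
class HistoryStoreSketch {
    private final Map<String, Object> outstandingWriters = new ConcurrentHashMap<>();

    // Called on the writer side (the RM in this scenario).
    void openForWrite(String appId) {
        outstandingWriters.put(appId, new Object());
    }

    // The guard from the snippet above, reduced to a boolean.
    boolean guardSeesWriter(String appId) {
        return outstandingWriters.containsKey(appId);
    }
}

public class OutstandingWritersDemo {
    public static void main(String[] args) {
        HistoryStoreSketch rmSide = new HistoryStoreSketch();
        HistoryStoreSketch timelineServerSide = new HistoryStoreSketch();

        String appId = "application_1406002968974_0003";
        rmSide.openForWrite(appId);

        // Only the process that registered the writer can see it.
        System.out.println("RM sees writer: " + rmSide.guardSeesWriter(appId));
        System.out.println("Timeline server sees writer: "
            + timelineServerSide.guardSeesWriter(appId));
    }
}
```

The reader-side guard never fires because the writer was registered in a different map. Any check for "still under writing" would have to rely on state both processes can observe, such as something recorded in the file system itself.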

> Jobs are not displaying in timeline server after RM restart
> -----------------------------------------------------------
>
>                 Key: YARN-2330
>                 URL: https://issues.apache.org/jira/browse/YARN-2330
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: 2.4.1
>         Environment: Nodemanagers 3 (3*8GB)
> Queues A = 70%
> Queues B = 30%
>            Reporter: Nishan Shetty
>
> Submit jobs to queue A.
> While a job is running, restart the RM.
> Observe that those jobs are not displayed in the timeline server.
> {code}
> 2014-07-22 10:11:32,084 ERROR org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore: History information of application application_1406002968974_0003 is not included into the result due to the exception
> java.io.IOException: Cannot seek to negative offset
> 	at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1381)
> 	at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:63)
> 	at org.apache.hadoop.io.file.tfile.BCFile$Reader.<init>(BCFile.java:624)
> 	at org.apache.hadoop.io.file.tfile.TFile$Reader.<init>(TFile.java:804)
> 	at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore$HistoryFileReader.<init>(FileSystemApplicationHistoryStore.java:683)
> 	at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.getHistoryFileReader(FileSystemApplicationHistoryStore.java:661)
> 	at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.getApplication(FileSystemApplicationHistoryStore.java:146)
> 	at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.getAllApplications(FileSystemApplicationHistoryStore.java:199)
> 	at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getAllApplications(ApplicationHistoryManagerImpl.java:103)
> 	at org.apache.hadoop.yarn.server.webapp.AppsBlock.render(AppsBlock.java:75)
> 	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:66)
> 	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:76)
> 	at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
> 	at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
> 	at org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
> 	at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845)
> 	at org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:56)
> 	at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
> 	at org.apache.hadoop.yarn.webapp.Dispatcher.render(Dispatcher.java:197)
> 	at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:156)
> 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> 	at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
> 	at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
> 	at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
> 	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
> 	at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900)
> 	at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
> 	at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
> 	at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
> 	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
> 	at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
> 	at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
> 	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> 	at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
> 	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> 	at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1192)
> 	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> 	at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
> 	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> {code}


