Posted to yarn-issues@hadoop.apache.org by "Qi Zhu (Jira)" <ji...@apache.org> on 2021/07/17 02:43:00 UTC

[jira] [Comment Edited] (YARN-10855) yarn logs cli fails to retrieve logs if any TFile is corrupt or empty

    [ https://issues.apache.org/jira/browse/YARN-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17382403#comment-17382403 ] 

Qi Zhu edited comment on YARN-10855 at 7/17/21, 2:42 AM:
---------------------------------------------------------

Thanks [~Jim_Brennan] for the update.

cc [~epayne]

If there are no other comments, I will commit it.


was (Author: zhuqi):
Thanks [~Jim_Brennan] for the update.

If there are no other comments, I will commit it.

> yarn logs cli fails to retrieve logs if any TFile is corrupt or empty
> ---------------------------------------------------------------------
>
>                 Key: YARN-10855
>                 URL: https://issues.apache.org/jira/browse/YARN-10855
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 3.2.2, 2.10.1, 3.4.0, 3.3.1
>            Reporter: Jim Brennan
>            Assignee: Jim Brennan
>            Priority: Major
>         Attachments: YARN-10855.001.patch, YARN-10855.002.patch, YARN-10855.003.patch
>
>
> Retrieving yarn logs via the CLI command failed with the following stack trace (on branch-2.10):
> {noformat}
> yarn logs -applicationId application_1591017890475_1049740 > logs
> 20/06/05 19:15:50 INFO client.RMProxy: Connecting to ResourceManager 
> 20/06/05 19:15:51 INFO client.AHSProxy: Connecting to Application History server 
> Exception in thread "main" java.io.EOFException: Cannot seek to negative offset
> 	at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1701)
> 	at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:65)
> 	at org.apache.hadoop.io.file.tfile.BCFile$Reader.<init>(BCFile.java:624)
> 	at org.apache.hadoop.io.file.tfile.TFile$Reader.<init>(TFile.java:804)
> 	at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.<init>(AggregatedLogFormat.java:503)
> 	at org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAllContainersLogs(LogCLIHelpers.java:227)
> 	at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:333)
> 	at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:367) 
> {noformat}
> The cause was a zero-length TFile for one of the containers in the application's aggregated log directory in HDFS. After we removed the zero-length file, {{yarn logs}} was able to retrieve the logs.
> A corrupt or zero-length TFile for one container should not prevent loading logs for the rest of the application.
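
The behavior requested in the issue description above can be illustrated with a minimal sketch. This is not the code from the attached patches: {{TolerantLogDump}}, {{readLogFile}} and {{dumpAll}} are hypothetical names, and the sketch reads plain files from a local directory with java.nio rather than TFiles from HDFS. It only shows the general idea of catching the failure per file so that one bad file does not abort the whole dump.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper for illustration only; not part of Hadoop.
final class TolerantLogDump {

    // Stand-in for opening one aggregated log file. It rejects empty files,
    // mimicking the EOFException seen in the stack trace above.
    static String readLogFile(Path file) throws IOException {
        if (Files.size(file) == 0) {
            throw new EOFException("zero-length log file");
        }
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    // Reads every regular file in dir in sorted order. A file that fails to
    // load is reported on stderr and recorded in 'skipped'; the loop then
    // continues with the remaining files instead of propagating the error.
    static List<String> dumpAll(Path dir, List<Path> skipped) throws IOException {
        List<Path> files = new ArrayList<>();
        try (Stream<Path> entries = Files.list(dir)) {
            entries.filter(Files::isRegularFile).sorted().forEach(files::add);
        }
        List<String> logs = new ArrayList<>();
        for (Path file : files) {
            try {
                logs.add(readLogFile(file));
            } catch (IOException e) {
                System.err.println("Skipping " + file + ": " + e.getMessage());
                skipped.add(file);
            }
        }
        return logs;
    }
}
```

Recording the skipped files, rather than silently dropping them, lets a caller still tell the user which containers' logs could not be retrieved.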


