Posted to yarn-issues@hadoop.apache.org by "Andras Gyori (Jira)" <ji...@apache.org> on 2020/07/30 11:50:00 UTC

[jira] [Reopened] (YARN-4783) Log aggregation failure for application when Nodemanager is restarted

     [ https://issues.apache.org/jira/browse/YARN-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andras Gyori reopened YARN-4783:
--------------------------------

I am reopening this issue to look for a less invasive way of handling this corner case, since it was reported a long time ago and is still unresolved.
I have uploaded a new patch, without a test case for now.

The main idea is to try to renew the token stored in the application credentials when the application transitions from NEW to INITING. If the renewal succeeds, the token is valid and nothing more needs to be done from the application's point of view. However, if the renewal fails with an InvalidToken error, we request a new token on behalf of the user.

When a new token has been requested this way, it becomes the application's responsibility to clean it up once the corresponding operations are done, so the token is canceled when log aggregation finishes.
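
The flow described above can be sketched roughly as follows. This is only an illustration of the renew-or-request idea, not the code in the patch: TokenIssuer, AppTokenState, InvalidTokenException, InMemoryIssuer and all method names are hypothetical stand-ins, and tokens are modelled as plain strings so that the example runs without a cluster.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: every type and method name here is a hypothetical
// stand-in, not Hadoop's API and not code from the attached patch.
class TokenRecoverySketch {

    /** Stand-in for the "invalid token" error raised by a failed renewal. */
    static class InvalidTokenException extends Exception {
        InvalidTokenException(String message) {
            super(message);
        }
    }

    /** Hypothetical service that renews, issues and cancels tokens. */
    interface TokenIssuer {
        void renew(String token) throws InvalidTokenException;
        String requestNew(String user);
        void cancel(String token);
    }

    /** Token-related state kept per application. */
    static class AppTokenState {
        final String user;
        String token;
        boolean requestedByUs = false; // true only after we asked for a replacement

        AppTokenState(String user, String token) {
            this.user = user;
            this.token = token;
        }
    }

    /** Runs on the NEW -> INITING transition. */
    static void onInit(AppTokenState app, TokenIssuer issuer) {
        try {
            issuer.renew(app.token); // success: the stored token is still valid
        } catch (InvalidTokenException e) {
            // Renewal failed: request a fresh token on behalf of the user.
            app.token = issuer.requestNew(app.user);
            app.requestedByUs = true;
        }
    }

    /** Runs once log aggregation for the application has finished. */
    static void onLogAggregationFinished(AppTokenState app, TokenIssuer issuer) {
        if (app.requestedByUs) {
            issuer.cancel(app.token); // we requested it, so we clean it up
            app.requestedByUs = false;
        }
    }

    /** Tiny in-memory issuer so the sketch can be exercised without a cluster. */
    static class InMemoryIssuer implements TokenIssuer {
        final Set<String> valid = new HashSet<>();
        final List<String> cancelled = new ArrayList<>();
        private int issued = 0;

        public void renew(String token) throws InvalidTokenException {
            if (!valid.contains(token)) {
                throw new InvalidTokenException("unknown token: " + token);
            }
        }

        public String requestNew(String user) {
            String token = user + "-token-" + (++issued);
            valid.add(token);
            return token;
        }

        public void cancel(String token) {
            valid.remove(token);
            cancelled.add(token);
        }
    }

    public static void main(String[] args) {
        InMemoryIssuer issuer = new InMemoryIssuer();
        // "stale" was never registered with the issuer, so renewal fails.
        AppTokenState app = new AppTokenState("alice", "stale");
        onInit(app, issuer);
        System.out.println("token after init: " + app.token);
        onLogAggregationFinished(app, issuer);
        System.out.println("cancelled: " + issuer.cancelled);
    }
}
```

The requestedByUs flag is the point of the sketch: a token that was successfully renewed is left alone, and only a token the application requested itself is canceled after log aggregation.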

> Log aggregation failure for application when Nodemanager is restarted 
> ----------------------------------------------------------------------
>
>                 Key: YARN-4783
>                 URL: https://issues.apache.org/jira/browse/YARN-4783
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.7.1
>            Reporter: Surendra Singh Lilhore
>            Assignee: Andras Gyori
>            Priority: Major
>
> Scenario :
> =========
> 1. Start NM with user dsperf:hadoop
> 2. Configure linux-execute user as dsperf
> 3. Submit application with yarn user
> 4. Wait until a few containers are allocated to NM 1
> 5. Stop Nodemanager 1 (wait for expiry)
> 6. Start the Nodemanager after the application has completed
> 7. Check whether log aggregation happens for the container logs in the NM local directory
> Expected Output :
> =================
> Log aggregation should be successful
> Actual Output :
> ===============
> Log aggregation is not successful



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
