Posted to yarn-issues@hadoop.apache.org by "Chandni Singh (JIRA)" <ji...@apache.org> on 2018/10/23 21:10:00 UTC

[jira] [Comment Edited] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out

    [ https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661291#comment-16661291 ] 

Chandni Singh edited comment on YARN-8672 at 10/23/18 9:09 PM:
---------------------------------------------------------------

On second thought, if a container is killed while it is localizing, we can change the code so that it does not cancel the localizer.

Please see patch 3. It addresses the leak of the localizer's token files when a container gets killed.

IMO this is just a temporary solution to fix the test case, and it is the safer change. If it is okay, I can create a follow-up Jira for the suggestion quoted here:

{quote}
It seems odd to me that we create localizer tokens _and_ container tokens. Seems like we only need one of these, and the container tokens have the benefit of getting automatically cleaned up as part of removing container directories. If for some reason we have to keep them separate then we could change the localization path to be under the same nmPrivate directory used for the container so we don't have to be so careful about removing these things as part of cleaning up localizers – it will be cleaned up automatically as part of cleaning the container.
{quote}
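
To make the quoted suggestion concrete, here is a minimal sketch of the directory-layout idea only. It is not NodeManager code: the class name, method names, directory names and the ".tokens" suffix are all hypothetical, and a temporary directory stands in for nmPrivate. It only shows that a token file written under the container's own private directory goes away when that directory is removed, so no separate localizer cleanup step is needed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class TokenLayoutSketch {

    // Hypothetical layout: write the localizer token inside the container's
    // private directory instead of directly under nmPrivate.
    static Path writeLocalizerToken(Path containerPrivateDir, String localizerId)
            throws IOException {
        Files.createDirectories(containerPrivateDir);
        Path token = containerPrivateDir.resolve(localizerId + ".tokens");
        Files.write(token, new byte[] {0}); // placeholder token contents
        return token;
    }

    // Stand-in for container cleanup: recursively delete the container
    // directory, children first, then the directory itself.
    static void cleanupContainer(Path containerPrivateDir) throws IOException {
        try (Stream<Path> paths = Files.walk(containerPrivateDir)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    public static void main(String[] args) throws IOException {
        // A temp directory plays the role of nmPrivate (assumption for the sketch).
        Path nmPrivate = Files.createTempDirectory("nmPrivate");
        Path containerDir = nmPrivate.resolve("application_1").resolve("container_1");

        Path token = writeLocalizerToken(containerDir, "container_1");
        System.out.println("token exists before cleanup: " + Files.exists(token));

        cleanupContainer(containerDir);
        System.out.println("token exists after cleanup: " + Files.exists(token));
    }
}
```

With this layout the localizer token needs no dedicated removal path. The trade-off is that a localizer still running when the container directory is deleted would lose its token file, which is the kind of race this issue is about.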
 


was (Author: csingh):
On a second thought, if a container is killed while localizing, we can make the change  to not cancel the localizer.

Please see patch 3. This will address leaking localizer's token files when container gets killed. 

IMO this is just a temporary solution to fix the test case and is safer. I can create a followup Jira for if this is okay:

{quote}
It seems odd to me that we create localizer tokens _and_ container tokens. Seems like we only need one of these, and the container tokens have the benefit of getting automatically cleaned up as part of removing container directories. If for some reason we have to keep them separate then we could change the localization path to be under the same nmPrivate directory used for the container so we don't have to be so careful about removing these things as part of cleaning up localizers – it will be cleaned up automatically as part of cleaning the container.
{quote}
 

> TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-8672
>                 URL: https://issues.apache.org/jira/browse/YARN-8672
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.2.0
>            Reporter: Jason Lowe
>            Assignee: Chandni Singh
>            Priority: Major
>         Attachments: YARN-8672.001.patch, YARN-8672.002.patch, YARN-8672.003.patch
>
>
> Precommit builds have been failing in TestContainerManager#testLocalingResourceWhileContainerRunning.  I have been able to reproduce the problem without any patch applied if I run the test enough times.  It looks like something is removing container tokens from the nmPrivate area just as a new localizer starts.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
