Posted to yarn-issues@hadoop.apache.org by "Chandni Singh (JIRA)" <ji...@apache.org> on 2018/11/06 21:50:00 UTC

[jira] [Comment Edited] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out

    [ https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16677311#comment-16677311 ] 

Chandni Singh edited comment on YARN-8672 at 11/6/18 9:49 PM:
--------------------------------------------------------------

[~eyang] Please see below:

In {{DefaultContainerExecutor.startLocalizer(LocalizerStartContext ctx)}}, the token file path is read from the start context and the file is then copied to {{appStorageDir/<containerId.token>}}. The {{appStorageDir}} is then set as the working directory for {{ContainerLocalizer}}. This copy is the file that {{runLocalization}} reads, so patch 005 is not going to break that method.
{code:java}
Path nmPrivateContainerTokensPath = ctx.getNmPrivateContainerTokens();

String tokenFn =
    String.format(ContainerLocalizer.TOKEN_FILE_NAME_FMT, locId);
Path tokenDst = new Path(appStorageDir, tokenFn);
copyFile(nmPrivateContainerTokensPath, tokenDst, user);
LOG.info("Copying from " + nmPrivateContainerTokensPath
    + " to " + tokenDst);
...
localizerFc.setWorkingDirectory(appStorageDir);
{code}
In {{LinuxContainerExecutor.startLocalizer(LocalizerStartContext ctx)}}, the token file path is appended to the arguments of the privileged operation.
{code:java}
    initializeContainerOp.appendArgs(
        runAsUser,
        user,
        Integer.toString(
            PrivilegedOperation.RunAsUserCommand.INITIALIZE_CONTAINER
                .getValue()),
        appId,
        locId,
        nmPrivateContainerTokensPath.toUri().getPath().toString(),
        StringUtils.join(PrivilegedOperation.LINUX_FILE_PATH_SEPARATOR,
            localDirs),
        StringUtils.join(PrivilegedOperation.LINUX_FILE_PATH_SEPARATOR,
            logDirs));
{code}
I assumed the token file would be copied to the working directory under the same name, <containerId.token> (just like DefaultContainerExecutor), when the privileged operation is executed.

The {{ContainerLocalizer.run}} method does assume that the token file is in the current working directory.
{code:java}
Path tokenPath =
    new Path(String.format(TOKEN_FILE_NAME_FMT, localizerId));
credFile = lfs.open(tokenPath);
creds.readTokenStorageStream(credFile);
{code}
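To make the copy-then-read-by-relative-name flow concrete, here is a minimal standalone sketch. It is not the Hadoop code: it uses {{java.nio.file}} instead of {{FileContext}}, the class and method names ({{TokenFileSketch}}, {{stageToken}}, {{readToken}}, {{demo}}) are hypothetical, and the {{"%s.tokens"}} format string is only an assumption about what {{TOKEN_FILE_NAME_FMT}} expands to. It illustrates that once the token has been copied into the working directory, a reader that resolves the formatted name against that directory no longer depends on the original file in the private area.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class TokenFileSketch {
    // Assumption: stands in for ContainerLocalizer.TOKEN_FILE_NAME_FMT.
    static final String TOKEN_FILE_NAME_FMT = "%s.tokens";

    // Copy the private token file into the localizer's working directory,
    // named after the localizer id (analogous to copyFile(...) above).
    static Path stageToken(Path nmPrivateTokens, Path appStorageDir, String locId) {
        Path tokenDst = appStorageDir.resolve(String.format(TOKEN_FILE_NAME_FMT, locId));
        try {
            Files.createDirectories(appStorageDir);
            Files.copy(nmPrivateTokens, tokenDst, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokenDst;
    }

    // Read the token by its relative name, resolved against the working
    // directory (analogous to opening a relative Path after setWorkingDirectory).
    static String readToken(Path workingDir, String locId) {
        Path tokenPath = workingDir.resolve(String.format(TOKEN_FILE_NAME_FMT, locId));
        try {
            return new String(Files.readAllBytes(tokenPath), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // End-to-end run in temporary directories; returns what the reader saw.
    static String demo() {
        try {
            Path nmPrivate = Files.createTempDirectory("nmPrivate");
            Path appStorageDir = Files.createTempDirectory("appStorage");
            String locId = "container_0001"; // hypothetical localizer id
            Path src = nmPrivate.resolve(String.format(TOKEN_FILE_NAME_FMT, locId));
            Files.write(src, "dummy-token-bytes".getBytes(StandardCharsets.UTF_8));

            stageToken(src, appStorageDir, locId);
            Files.delete(src); // simulate the private copy being cleaned up
            return readToken(appStorageDir, locId);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the read still succeeds after the source file is deleted, because only the staged copy in the working directory is consulted. Whether the privileged operation in the Linux executor stages the file the same way is exactly the assumption stated above, and the sketch does not verify it.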
cc [~jlowe] [~shanekumpf@gmail.com]



> TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-8672
>                 URL: https://issues.apache.org/jira/browse/YARN-8672
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.2.0
>            Reporter: Jason Lowe
>            Assignee: Chandni Singh
>            Priority: Major
>         Attachments: YARN-8672.001.patch, YARN-8672.002.patch, YARN-8672.003.patch, YARN-8672.004.patch, YARN-8672.005.patch
>
>
> Precommit builds have been failing in TestContainerManager#testLocalingResourceWhileContainerRunning.  I have been able to reproduce the problem without any patch applied if I run the test enough times.  It looks like something is removing container tokens from the nmPrivate area just as a new localizer starts.


