Posted to yarn-issues@hadoop.apache.org by "Chandni Singh (JIRA)" <ji...@apache.org> on 2018/11/06 21:50:00 UTC
[jira] [Comment Edited] (YARN-8672)
TestContainerManager#testLocalingResourceWhileContainerRunning occasionally
times out
[ https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16677311#comment-16677311 ]
Chandni Singh edited comment on YARN-8672 at 11/6/18 9:49 PM:
--------------------------------------------------------------
[~eyang] Please see below:
In {{DefaultContainerExecutor.startLocalizer(LocalizerStartContext ctx)}}, the token file path is read from the start context and the file is then copied to {{appStorageDir/<containerId.token>}}. The {{appStorageDir}} is then set as the working directory for {{ContainerLocalizer}}. This copy is the file that is read in {{runLocalization}}, so patch 005 will not break that method.
{code:java}
Path nmPrivateContainerTokensPath = ctx.getNmPrivateContainerTokens();
String tokenFn =
    String.format(ContainerLocalizer.TOKEN_FILE_NAME_FMT, locId);
Path tokenDst = new Path(appStorageDir, tokenFn);
copyFile(nmPrivateContainerTokensPath, tokenDst, user);
LOG.info("Copying from " + nmPrivateContainerTokensPath
    + " to " + tokenDst);
...
localizerFc.setWorkingDirectory(appStorageDir);
{code}
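The copy-then-set-working-directory flow above can be sketched with plain JDK file APIs. This is only an illustration and not the Hadoop code: the class {{TokenStagingSketch}}, its methods, the sample ids, and the {{"%s.tokens"}} format string (standing in for {{ContainerLocalizer.TOKEN_FILE_NAME_FMT}}) are assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Plain-JDK illustration only; every name in this class is hypothetical.
class TokenStagingSketch {
  // Assumed stand-in for ContainerLocalizer.TOKEN_FILE_NAME_FMT.
  static final String TOKEN_FILE_NAME_FMT = "%s.tokens";

  // Copies the NM-private token file into appStorageDir under the
  // per-localizer name and returns the destination path.
  static Path stageToken(Path nmPrivateTokens, Path appStorageDir,
      String locId) {
    try {
      Files.createDirectories(appStorageDir);
      Path tokenDst =
          appStorageDir.resolve(String.format(TOKEN_FILE_NAME_FMT, locId));
      return Files.copy(nmPrivateTokens, tokenDst,
          StandardCopyOption.REPLACE_EXISTING);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Stages a token, then reads it back knowing only the working
  // directory and the localizer id.
  static String demo() {
    try {
      Path root = Files.createTempDirectory("token-staging-sketch");
      Path nmPrivate = Files.write(root.resolve("nmPrivate.tokens"),
          "secret".getBytes(StandardCharsets.UTF_8));
      Path appStorageDir = root.resolve("appcache");
      stageToken(nmPrivate, appStorageDir, "container_1");
      Path staged = appStorageDir.resolve(
          String.format(TOKEN_FILE_NAME_FMT, "container_1"));
      return new String(Files.readAllBytes(staged), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

The point of the sketch is that the reader never needs the NM-private path: once the copy exists in the working directory, the localizer id alone is enough to find it.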
In {{LinuxContainerExecutor.startLocalizer(LocalizerStartContext ctx)}}, the token file path is instead appended to the arguments of the privileged operation.
{code:java}
initializeContainerOp.appendArgs(
    runAsUser,
    user,
    Integer.toString(
        PrivilegedOperation.RunAsUserCommand.INITIALIZE_CONTAINER
            .getValue()),
    appId,
    locId,
    nmPrivateContainerTokensPath.toUri().getPath().toString(),
    StringUtils.join(PrivilegedOperation.LINUX_FILE_PATH_SEPARATOR,
        localDirs),
    StringUtils.join(PrivilegedOperation.LINUX_FILE_PATH_SEPARATOR,
        logDirs));
{code}
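As a rough sketch of what that argument list looks like once assembled, the snippet below builds the same positional arguments with plain JDK collections. It is not the {{PrivilegedOperation}} API: the class {{LocalizerArgsSketch}}, the {{"%"}} separator (assumed to stand in for {{LINUX_FILE_PATH_SEPARATOR}}), and the command code passed by the caller are all placeholders chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only; not the PrivilegedOperation API. Names are hypothetical.
class LocalizerArgsSketch {
  // Assumed separator, standing in for
  // PrivilegedOperation.LINUX_FILE_PATH_SEPARATOR.
  static final String DIR_SEPARATOR = "%";

  // Builds the positional arguments in the same order as the snippet above.
  static List<String> buildArgs(String runAsUser, String user,
      int commandCode, String appId, String locId, String tokenPath,
      List<String> localDirs, List<String> logDirs) {
    List<String> args = new ArrayList<>();
    args.add(runAsUser);
    args.add(user);
    args.add(Integer.toString(commandCode));
    args.add(appId);
    args.add(locId);
    // The token file is passed by path; nothing is copied at this point.
    args.add(tokenPath);
    // Each directory list is flattened into a single argument.
    args.add(String.join(DIR_SEPARATOR, localDirs));
    args.add(String.join(DIR_SEPARATOR, logDirs));
    return args;
  }
}
```

Note that the Java side only hands over the NM-private path as the sixth argument; whatever copying happens is left to the code that executes the operation.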
I assumed the token file would be copied to the working directory under the same name, <containerId.token> (just like in DefaultContainerExecutor), when the privileged operation is executed.
The {{ContainerLocalizer.run}} method does assume that the token file is in the current working directory.
{code:java}
Path tokenPath =
    new Path(String.format(TOKEN_FILE_NAME_FMT, localizerId));
credFile = lfs.open(tokenPath);
creds.readTokenStorageStream(credFile);
{code}
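To make the working-directory assumption concrete, here is a small plain-JDK sketch of a lookup by relative token name. It is not the {{ContainerLocalizer}} code: {{RelativeTokenLookupSketch}}, {{readToken}}, the sample id, and the {{"%s.tokens"}} format string are hypothetical names used only to show that the read succeeds only if the file is already present in the working directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Optional;

// Plain-JDK illustration only; every name in this class is hypothetical.
class RelativeTokenLookupSketch {
  // Assumed stand-in for ContainerLocalizer.TOKEN_FILE_NAME_FMT.
  static final String TOKEN_FILE_NAME_FMT = "%s.tokens";

  // Resolves the token name against workDir; returns empty if the file
  // was never placed there.
  static Optional<byte[]> readToken(Path workDir, String localizerId) {
    Path tokenPath =
        workDir.resolve(String.format(TOKEN_FILE_NAME_FMT, localizerId));
    try {
      return Optional.of(Files.readAllBytes(tokenPath));
    } catch (NoSuchFileException e) {
      return Optional.empty();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Returns {lookupEmptyBeforeCopy, lookupPresentAfterCopy}.
  static boolean[] demo() {
    try {
      Path workDir = Files.createTempDirectory("relative-token-sketch");
      boolean emptyBefore = !readToken(workDir, "container_1").isPresent();
      Files.write(workDir.resolve("container_1.tokens"),
          "secret".getBytes(StandardCharsets.UTF_8));
      boolean presentAfter = readToken(workDir, "container_1").isPresent();
      return new boolean[] {emptyBefore, presentAfter};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```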
cc [~jlowe] [~shanekumpf@gmail.com]
> TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out
> -------------------------------------------------------------------------------------
>
> Key: YARN-8672
> URL: https://issues.apache.org/jira/browse/YARN-8672
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 3.2.0
> Reporter: Jason Lowe
> Assignee: Chandni Singh
> Priority: Major
> Attachments: YARN-8672.001.patch, YARN-8672.002.patch, YARN-8672.003.patch, YARN-8672.004.patch, YARN-8672.005.patch
>
>
> Precommit builds have been failing in TestContainerManager#testLocalingResourceWhileContainerRunning. I have been able to reproduce the problem without any patch applied if I run the test enough times. It looks like something is removing container tokens from the nmPrivate area just as a new localizer starts.