Posted to yarn-issues@hadoop.apache.org by "Bibin A Chundatt (JIRA)" <ji...@apache.org> on 2016/03/01 05:40:18 UTC
[jira] [Updated] (YARN-4744) Too many signal to container failure in case of LCE
[ https://issues.apache.org/jira/browse/YARN-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bibin A Chundatt updated YARN-4744:
-----------------------------------
Description:
Install an HA cluster in secure mode
Enable LCE with cgroups
Start the server as user dsperf
Submit an application as user yarn
The NodeManager log then shows too many "Signal container failed" warnings:
{noformat}
2014-03-01 14:10:32,223 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime: Using container runtime: DefaultLinuxContainerRuntime
2014-03-01 14:10:32,228 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: Shell execution returned exit code: 9. Privileged Execution Operation Output:
main : command provided 2
main : run as user is yarn
main : requested yarn user is yarn
Full command array for failed execution:
[/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, yarn, yarn, 2, 28575, 15]
2014-03-01 14:10:32,228 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime: Signal container failed. Exception:
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=9:
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.signalContainer(DefaultLinuxContainerRuntime.java:132)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.signalContainer(DelegatingLinuxContainerRuntime.java:109)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecutor.java:513)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:520)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
    at java.lang.Thread.run(Thread.java:745)
Caused by: ExitCodeException exitCode=9:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:927)
    at org.apache.hadoop.util.Shell.run(Shell.java:838)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150)
    ... 9 more
{noformat}
Checked the same scenario in version 2.7.2; the failure does not occur there.
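For anyone decoding the "Full command array for failed execution" line in the log above, here is a minimal Java sketch that labels its fields. The first four positions match what container-executor itself prints ("command provided 2", "run as user is yarn", "requested yarn user is yarn"). Reading the last two values as the target PID and the signal number is an assumption inferred from the signalContainer stack trace, not something the log states. The class name SignalCommand and its methods are hypothetical and not part of Hadoop.

```java
/**
 * Illustrative helper, not Hadoop code: labels the fields of the failed
 * container-executor invocation printed by PrivilegedOperationExecutor.
 * Treating the 5th and 6th elements as PID and signal is an assumption.
 */
public final class SignalCommand {
    public final String executable;
    public final String runAsUser;
    public final String user;
    public final int command;
    public final int pid;     // assumed: PID of the container process
    public final int signal;  // assumed: signal number to deliver

    private SignalCommand(String executable, String runAsUser, String user,
                          int command, int pid, int signal) {
        this.executable = executable;
        this.runAsUser = runAsUser;
        this.user = user;
        this.command = command;
        this.pid = pid;
        this.signal = signal;
    }

    /** Parses a line such as "[/path/container-executor, yarn, yarn, 2, 28575, 15]". */
    public static SignalCommand parse(String line) {
        String trimmed = line.trim();
        if (!trimmed.startsWith("[") || !trimmed.endsWith("]")) {
            throw new IllegalArgumentException("expected a bracketed array: " + line);
        }
        String[] parts = trimmed.substring(1, trimmed.length() - 1).split(",\\s*");
        if (parts.length != 6) {
            throw new IllegalArgumentException("expected 6 elements, got " + parts.length);
        }
        return new SignalCommand(parts[0], parts[1], parts[2],
                Integer.parseInt(parts[3]),
                Integer.parseInt(parts[4]),
                Integer.parseInt(parts[5]));
    }

    /** Maps a few common Linux signal numbers to their names. */
    public String signalName() {
        switch (signal) {
            case 3:  return "SIGQUIT";
            case 9:  return "SIGKILL";
            case 15: return "SIGTERM";
            default: return "signal " + signal;
        }
    }

    public static void main(String[] args) {
        SignalCommand c = parse("[/opt/bibin/dsperf/HAINSTALL/install/hadoop/"
                + "nodemanager/bin/container-executor, yarn, yarn, 2, 28575, 15]");
        System.out.println("run as " + c.runAsUser + ", pid " + c.pid
                + ", " + c.signalName());
    }
}
```

Under that reading, the failing call is an attempt to send SIGTERM to process 28575 as user yarn, and it is that attempt which returns exit code 9.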
was:
Enable LCE with cgroups
Start the server as user dsperf
Submit an application as user yarn
The NodeManager log then shows too many "Signal container failed" warnings:
{noformat}
2014-03-01 14:10:32,223 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime: Using container runtime: DefaultLinuxContainerRuntime
2014-03-01 14:10:32,228 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: Shell execution returned exit code: 9. Privileged Execution Operation Output:
main : command provided 2
main : run as user is yarn
main : requested yarn user is yarn
Full command array for failed execution:
[/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, yarn, yarn, 2, 28575, 15]
2014-03-01 14:10:32,228 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime: Signal container failed. Exception:
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=9:
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.signalContainer(DefaultLinuxContainerRuntime.java:132)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.signalContainer(DelegatingLinuxContainerRuntime.java:109)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecutor.java:513)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:520)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
    at java.lang.Thread.run(Thread.java:745)
Caused by: ExitCodeException exitCode=9:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:927)
    at org.apache.hadoop.util.Shell.run(Shell.java:838)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150)
    ... 9 more
{noformat}
Checked the same scenario in version 2.7.2; the failure does not occur there.
> Too many signal to container failure in case of LCE
> ---------------------------------------------------
>
> Key: YARN-4744
> URL: https://issues.apache.org/jira/browse/YARN-4744
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.9.0
> Reporter: Bibin A Chundatt