Posted to common-user@hadoop.apache.org by Shuai Zhang <sh...@microsoft.com> on 2016/04/22 11:58:35 UTC

[HELP] Failed to work with DockerContainerExecutor in Yarn

Hi there,

I’m trying to use DockerContainerExecutor in YARN, as described in https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/DockerContainerExecutor.html

I followed the documentation and set everything up, but even a simple wordcount job fails to run with DockerContainerExecutor.

The console shows the failure:

> Diagnostics: Exception from container-launch: 
> ExitCodeException exitCode=126: Error: No such image or container: container_1461317429669_0001_02_000001
> Usage of loopback devices is strongly discouraged for production use. Either use `--storage-opt dm.thinpooldev` or use `--storage-opt dm.no_warn_on_loop_devices=true` to suppress this warning.
> bash: /tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1461317429669_0001/container_1461317429669_0001_02_000001/launch_container.sh: Permission denied
>
>	at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
> 	at org.apache.hadoop.util.Shell.run(Shell.java:456)
> ... 
> Container exited with a non-zero exit code 126
> Failing this attempt. Failing the application.
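
For context on the exit code: 126 is the shell convention for a command that was found but could not be executed, which matches the "Permission denied" line in the output above. The sketch below reproduces only that convention, using a throwaway script in a temporary directory. The function name and file name are made up for the demo, and it does not claim to reproduce the cause of the failure inside the container.

```shell
#!/usr/bin/env bash
# Illustration only: the shell reports exit code 126 when a file exists
# but cannot be executed. All names here are invented for the demo.

demo_exit_126() {
  local tmpdir script status
  tmpdir="$(mktemp -d)"
  script="$tmpdir/launch_demo.sh"

  # A valid script, but without any execute bit.
  printf '#!/bin/bash\necho hello\n' > "$script"
  chmod 644 "$script"

  # Executing it directly fails with "Permission denied".
  status=0
  "$script" 2>/dev/null || status=$?

  rm -rf "$tmpdir"
  echo "$status"
}

demo_exit_126   # prints 126
```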


The NodeManager log shows:

> 2016-04-22 17:31:40,976 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1461317429669_0001_01_000001 transitioned from LOCALIZED to RUNNING
> 2016-04-22 17:31:40,992 DEBUG org.apache.hadoop.yarn.server.nodemanager.DockerContainerExecutor: launchContainer: /usr/bin/docker run --rm --net=host  --name container_1461317429669_0001_01_000001 -v /tmp/hadoop-root/nm-local-dir:/tmp/hadoop-root/nm-local-dir -v /root/hadoop-2.7.2/logs/userlogs:/root/hadoop-2.7.2/logs/userlogs -v /tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1461317429669_0001/container_1461317429669_0001_01_000001:/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1461317429669_0001/container_1461317429669_0001_01_000001 sequenceiq/hadoop-docker:2.7.1 bash /tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1461317429669_0001/container_1461317429669_0001_01_000001/docker_container_executor.sh
> 2016-04-22 17:31:43,575 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Starting resource-monitoring for container_1461317429669_0001_01_000001
> 2016-04-22 17:31:45,832 WARN org.apache.hadoop.yarn.server.nodemanager.DockerContainerExecutor: Exit code from container container_1461317429669_0001_01_000001 is : 126
> 2016-04-22 17:31:45,834 WARN org.apache.hadoop.yarn.server.nodemanager.DockerContainerExecutor: Exception from container-launch with container ID: container_1461317429669_0001_01_000001 and exit code: 126
> ExitCodeException exitCode=126: Error: No such image or container: container_1461317429669_0001_01_000001
> Usage of loopback devices is strongly discouraged for production use. Either use `--storage-opt dm.thinpooldev` or use `--storage-opt dm.no_warn_on_loop_devices=true` to suppress this warning.
> bash: /tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1461317429669_0001/container_1461317429669_0001_01_000001/launch_container.sh: Permission denied
>
>	at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
>	at org.apache.hadoop.util.Shell.run(Shell.java:456)
> ...
> 2016-04-22 17:31:45,836 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: 
> 2016-04-22 17:31:45,837 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container exited with a non-zero exit code 126
> 2016-04-22 17:31:45,839 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1461317429669_0001_01_000001 transitioned from RUNNING to EXITED_WITH_FAILURE


I tried running the launch command manually and got the same error: Permission denied. The result changes when I add the "--privileged" option:

> /usr/bin/docker run --privileged --rm --net=host  --name container_1461317429669_0001_01_000001 -v /tmp/hadoop-root/nm-local-dir:/tmp/hadoop-root/nm-local-dir -v /root/hadoop-2.7.2/logs/userlogs:/root/hadoop-2.7.2/logs/userlogs -v /tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1461317429669_0001/container_1461317429669_0001_01_000001:/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1461317429669_0001/container_1461317429669_0001_01_000001 sequenceiq/hadoop-docker:2.7.1 bash /tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1461317429669_0001/container_1461317429669_0001_01_000001/docker_container_executor.sh
> Usage of loopback devices is strongly discouraged for production use. Either use `--storage-opt dm.thinpooldev` or use `--storage-opt dm.no_warn_on_loop_devices=true` to suppress this warning.
> /tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1461317429669_0001/container_1461317429669_0001_01_000001/docker_container_executor_session.sh: line 3: /usr/bin/docker: No such file or directory
> /tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1461317429669_0001/container_1461317429669_0001_01_000001/docker_container_executor_session.sh: line 5: /usr/bin/docker: No such file or directory
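
To narrow down the difference between the plain and the "--privileged" run, a smaller probe may help: instead of running the whole launch script, only check whether a bind-mounted file is readable inside the container. The following is a minimal sketch under assumptions. The function name probe_cmd is hypothetical, the image and mount path are copied from the command above, and it assumes the image provides a `test` binary.

```shell
#!/usr/bin/env bash
# Sketch: build a minimal docker command that only checks whether a
# bind-mounted file is readable inside the container.
# probe_cmd is a hypothetical helper; image and paths come from the
# failing launch command quoted above.

IMAGE="sequenceiq/hadoop-docker:2.7.1"
LOCAL_DIR="/tmp/hadoop-root/nm-local-dir"

# Usage: probe_cmd <file> [extra "docker run" options...]
# Prints the command instead of running it, so it can be reviewed first.
probe_cmd() {
  local file
  file="$1"
  shift
  # Rebuild the positional parameters as the full command line.
  set -- /usr/bin/docker run "$@" --rm --net=host \
    -v "$LOCAL_DIR:$LOCAL_DIR" "$IMAGE" test -r "$file"
  echo "$*"
}
```

Calling `probe_cmd <path to launch_container.sh>` and `probe_cmd <path to launch_container.sh> --privileged` prints two commands whose exit codes can be compared after running them by hand. If only the privileged variant succeeds, one thing that may be worth checking on CentOS 7 is whether SELinux is enforcing (`getenforce`); that is an assumption, not something the logs above confirm.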


Why does the launch script need to run the docker command inside a Docker container? Can anybody show me how to run jobs with DockerContainerExecutor?

Environment:
* CentOS 7
* Hadoop 2.7.2
* Docker 1.9.1
* Docker Image: sequenceiq/hadoop-docker:2.7.1

Steps to reproduce:
* Download and install the CentOS 7 minimal ISO with a single root user: http://isoredirect.centos.org/centos/7/isos/x86_64/CentOS-7-x86_64-Minimal-1511.iso
* yum install docker java-1.7.0-openjdk
* Download Hadoop 2.7.2: http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.7.2/hadoop-2.7.2.tar.gz
* Prepare configuration files from here: https://onedrive.live.com/redir?resid=F41C57B4F6E1B6C9!1618&authkey=!AEvln9poXczhSX4&ithint=file%2cgz
* Set up JAVA_HOME and HADOOP_PREFIX
* docker pull sequenceiq/hadoop-docker:2.7.1
* $HADOOP_PREFIX/bin/hdfs namenode -format
* $HADOOP_PREFIX/sbin/start-all.sh
* $HADOOP_PREFIX/bin/hdfs dfs -mkdir -p /user/root
* $HADOOP_PREFIX/bin/hdfs dfs -put $HADOOP_PREFIX/LICENSE.txt
* $HADOOP_PREFIX/bin/hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount -Dmapreduce.map.env="yarn.nodemanager.docker-container-executor.image-name=sequenceiq/hadoop-docker:2.7.1" -Dyarn.app.mapreduce.am.env="yarn.nodemanager.docker-container-executor.image-name=sequenceiq/hadoop-docker:2.7.1" LICENSE.txt wc_out
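
The Hadoop-side steps above can be sketched as one script. This is a rough sketch, not a tested installer: it assumes JAVA_HOME is set, the configuration files are in place, and the image has already been pulled. The DRY_RUN switch and the function names run and reproduce are illustrative additions, and the default HADOOP_PREFIX is inferred from the log paths (/root/hadoop-2.7.2).

```shell
#!/usr/bin/env bash
# Sketch of the Hadoop-side reproduction steps as a single script.
# DRY_RUN, run and reproduce are illustrative additions; the default
# HADOOP_PREFIX is an assumption inferred from the paths in the logs.

HADOOP_PREFIX="${HADOOP_PREFIX:-/root/hadoop-2.7.2}"
IMAGE="sequenceiq/hadoop-docker:2.7.1"
IMAGE_KEY="yarn.nodemanager.docker-container-executor.image-name"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

reproduce() {
  run "$HADOOP_PREFIX/bin/hdfs" namenode -format
  run "$HADOOP_PREFIX/sbin/start-all.sh"
  run "$HADOOP_PREFIX/bin/hdfs" dfs -mkdir -p /user/root
  run "$HADOOP_PREFIX/bin/hdfs" dfs -put "$HADOOP_PREFIX/LICENSE.txt"
  run "$HADOOP_PREFIX/bin/hadoop" jar \
    "$HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar" \
    wordcount \
    "-Dmapreduce.map.env=$IMAGE_KEY=$IMAGE" \
    "-Dyarn.app.mapreduce.am.env=$IMAGE_KEY=$IMAGE" \
    LICENSE.txt wc_out
}

reproduce
```

With the default DRY_RUN=1 the script only prints the five commands; setting DRY_RUN=0 runs them in the same order as the list.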

Additional info:
Related logs, configuration files, and generated shell scripts are available here: https://onedrive.live.com/redir?resid=F41C57B4F6E1B6C9!1621&authkey=!AGZttYYahRTc0KQ&ithint=folder%2cgz


Regards,
Shuai Zhang
