Posted to yarn-issues@hadoop.apache.org by "Peter Bacsko (JIRA)" <ji...@apache.org> on 2019/07/01 12:12:00 UTC

[jira] [Updated] (YARN-9660) Enhance documentation of Docker on YARN support

     [ https://issues.apache.org/jira/browse/YARN-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated YARN-9660:
-------------------------------
    Description: 
Right now, using Docker on YARN has some hard requirements. If these requirements are not met, launching the containers fails and an error message is printed. Depending on how familiar the user is with Docker, it may not be easy for them to understand what went wrong and how to fix the underlying problem.

We should explicitly document these requirements along with the corresponding error messages.

*#1: CGroups handler cannot be systemd*

If the Docker daemon runs with the systemd cgroups handler, we receive the following error upon launching a container:
{noformat}
Container id: container_1561638268473_0006_01_000002
Exit code: 7
Exception message: Launch container failed
Shell error output: /usr/bin/docker-current: Error response from daemon: cgroup-parent for systemd cgroup should be a valid slice named as "xxx.slice".
See '/usr/bin/docker-current run --help'.
Shell output: main : command provided 4
main : run as user is johndoe
main : requested yarn user is johndoe
{noformat}
Solution: switch to cgroupfs. How to do this can be OS-specific, but we can document a {{systemctl}} example.
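
A minimal sketch of how the check and the switch could look. The helper name {{check_cgroup_driver}} is hypothetical and not part of Docker or YARN. The commented steps assume a Docker installation that reads {{/etc/docker/daemon.json}} and is managed by systemd; on some distributions the cgroup driver is set elsewhere (for example in a sysconfig file or in the unit file), so treat this as a starting point rather than the definitive procedure.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the given cgroup driver name is cgroupfs.
check_cgroup_driver() {
  driver="$1"
  if [ "$driver" = "cgroupfs" ]; then
    echo "OK: cgroup driver is cgroupfs"
    return 0
  fi
  echo "FAIL: cgroup driver is '${driver}', expected cgroupfs"
  return 1
}

# On a NodeManager host, the current driver can be queried from the daemon:
#   check_cgroup_driver "$(docker info --format '{{.CgroupDriver}}')"
#
# One possible way to switch (assumes daemon.json is honored on this host):
#   1. Add to /etc/docker/daemon.json:
#        { "exec-opts": ["native.cgroupdriver=cgroupfs"] }
#   2. Restart the daemon:
#        sudo systemctl restart docker
```

After the restart, running the check again should report {{cgroupfs}}.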

 

*#2: {{/bin/bash}} must be present on the {{$PATH}} inside the container*

Some smaller images like "busybox" or "alpine" do not have {{/bin/bash}}. This is because all commands under {{/bin}} are links to {{/bin/busybox}}, and the only shell available is {{/bin/sh}}.

If we try to use this kind of image, we'll see the following error message:
{noformat}
Container id: container_1561638268473_0015_01_000002
Exit code: 7
Exception message: Launch container failed
Shell error output: /usr/bin/docker-current: Error response from daemon: oci runtime error: container_linux.go:235: starting container process caused "exec: \"bash\": executable file not found in $PATH".
Shell output: main : command provided 4
main : run as user is johndoe
main : requested yarn user is johndoe
{noformat}
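
To catch this before submitting an application, we could probe the image for the commands it needs. The sketch below is written for POSIX {{sh}} so that it also runs in images without bash; the function name {{missing_tools}} is hypothetical and only illustrates the idea.

```shell
#!/bin/sh
# Hypothetical helper: print the names of the given commands that are
# not found on $PATH, separated by spaces (empty output means all present).
missing_tools() {
  missing=""
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      # Append with a separating space only if the list is non-empty.
      missing="${missing}${missing:+ }${tool}"
    fi
  done
  echo "$missing"
}

# Example, to be run inside the container:
#   missing_tools bash
```

A quick one-off check from the host, assuming the image ships {{sh}}, could be {{docker run --rm --entrypoint sh alpine -c 'command -v bash'}}, which exits with a non-zero status when bash is absent.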
 

*#3: {{find}} command must be available on the {{$PATH}}*

It might seem safe to assume that the {{find}} command is always available, but even very popular images like {{fedora}} require installing it separately.

If we don't have {{find}} available, then {{launch_container.sh}} fails with:
{noformat}
[2019-07-01 03:51:25.053]Container exited with a non-zero exit code 127. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
/tmp/hadoop-systest/nm-local-dir/usercache/systest/appcache/application_1561638268473_0017/container_1561638268473_0017_01_000002/launch_container.sh: line 44: find: command not found
Last 4096 bytes of stderr.txt :
[2019-07-01 03:51:25.053]Container exited with a non-zero exit code 127. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
/tmp/hadoop-systest/nm-local-dir/usercache/systest/appcache/application_1561638268473_0017/container_1561638268473_0017_01_000002/launch_container.sh: line 44: find: command not found
Last 4096 bytes of stderr.txt :
{noformat}
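
On the common distributions, {{find}} is shipped in the {{findutils}} package, so the fix is usually one extra install step when building the image. The sketch below maps a package manager name to the matching install command; the function name {{findutils_install_cmd}} is hypothetical, and the list of package managers is an assumption about which base images are in use.

```shell
#!/bin/sh
# Hypothetical helper: print the command that installs findutils
# for the given package manager, or fail for an unknown one.
findutils_install_cmd() {
  case "$1" in
    dnf)     echo "dnf install -y findutils" ;;
    yum)     echo "yum install -y findutils" ;;
    apt-get) echo "apt-get install -y findutils" ;;
    apk)     echo "apk add findutils" ;;
    *)       echo "unknown package manager: $1" >&2
             return 1 ;;
  esac
}

# Example for a fedora-based image:
#   findutils_install_cmd dnf
```

The printed command would typically go into a {{RUN}} step of the Dockerfile used to build the image.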


> Enhance documentation of Docker on YARN support
> -----------------------------------------------
>
>                 Key: YARN-9660
>                 URL: https://issues.apache.org/jira/browse/YARN-9660
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: documentation, nodemanager
>            Reporter: Peter Bacsko
>            Priority: Major
>


