Posted to issues@mesos.apache.org by "Joseph Wu (JIRA)" <ji...@apache.org> on 2017/02/02 20:22:51 UTC

[jira] [Updated] (MESOS-7027) CommandExecutor ENV overwritten by Docker Image ENV in Unified Containerizer

     [ https://issues.apache.org/jira/browse/MESOS-7027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Wu updated MESOS-7027:
-----------------------------
              Sprint: Mesosphere Sprint 50
        Story Points: 5
    Target Version/s: 1.3.0  (was: 1.2.0)
            Priority: Critical  (was: Blocker)

This is technically a regression, as the bug was introduced in this commit (1.2.x): https://github.com/apache/mesos/commit/e3562845d7a491916df99b1b2c87b8acb4ebab3f

But I'm marking this as a non-blocker because:

* This bug is only encountered in specific installations/configurations of Mesos, particularly when the {{rpath}} of the Mesos binaries does not match the installation path, or when {{LD_LIBRARY_PATH}} is explicitly specified on the agent.
** Workaround for command executor is to launch a single-task {{TaskGroup}}
* The fix may require some API fixes/changes, which may take a while to review.


> CommandExecutor ENV overwritten by Docker Image ENV in Unified Containerizer
> ----------------------------------------------------------------------------
>
>                 Key: MESOS-7027
>                 URL: https://issues.apache.org/jira/browse/MESOS-7027
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Kevin Klues
>            Assignee: Joseph Wu
>            Priority: Critical
>              Labels: environment, mesosphere
>
> Using the unified containerizer, if a docker image is provisioned and has environment variables set via the ENV directive, those environment variables will be inherited by the {{mesos-executor}} process and overwrite similarly named environment variables that otherwise would have been inherited from the agent.
> This causes problems (for example) in DC/OS when trying to launch tasks based off the {{nvidia/cuda}} image. The {{nvidia/cuda}} image explicitly sets {{LD_LIBRARY_PATH}} to its own value so that the proper nvidia libraries will be available to whatever command is launched inside the container.
> However, DC/OS relies on {{LD_LIBRARY_PATH}} to contain a path to {{/opt/mesosphere/lib}} so that all of the mesosphere libraries are available to the mesos binaries launched by the agent ({{mesos-containerizer}}, {{mesos-executor}}, etc.). This is necessary to make sure that any external dependencies they might have (e.g. libssl.so) can be resolved at runtime.
> By overwriting the executor's environment with the Docker Image environment, {{LD_LIBRARY_PATH}} will not be set properly and {{mesos-executor}} will fail.
> It seems to me that the Docker Image environment should *only* overwrite the environment of the user process (not its executor). However, this can get complicated, because the executor is itself the user process when a custom executor is launched.
> We need to rethink how the environment is inherited/overwritten through all the various processes that get spawned while launching a container as well as how to make it work for tasks launched by arbitrary executors.
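
The overwrite described above can be illustrated with a small sketch. This is not Mesos code: the helper {{merge_env}} and all variable names are hypothetical, and the image values (including {{CUDA_VERSION}} and the nvidia library path) are placeholders chosen for illustration. It only models environments as dictionaries to contrast the reported behavior with the one the description suggests.

```python
def merge_env(base, overrides):
    """Return a copy of `base` where keys from `overrides` win on conflict."""
    merged = dict(base)
    merged.update(overrides)
    return merged


# Environment the agent would pass down (value taken from the description).
agent_env = {"LD_LIBRARY_PATH": "/opt/mesosphere/lib"}

# Environment declared by the image via ENV (placeholder values, assumed).
image_env = {
    "LD_LIBRARY_PATH": "/usr/local/nvidia/lib64",
    "CUDA_VERSION": "8.0",
}

# Reported behavior: the image ENV is applied to the executor itself,
# so the agent's LD_LIBRARY_PATH is lost for the executor.
reported_executor_env = merge_env(agent_env, image_env)

# Suggested behavior: the executor keeps the agent's environment and
# only the user process (the task) sees the image ENV.
suggested_executor_env = dict(agent_env)
suggested_task_env = merge_env(suggested_executor_env, image_env)

print(reported_executor_env["LD_LIBRARY_PATH"])
print(suggested_executor_env["LD_LIBRARY_PATH"])
print(suggested_task_env["LD_LIBRARY_PATH"])
```

Whether the task should start from the executor's environment at all, as assumed here, is one of the open questions raised in the description. With a custom executor the executor and the user process are the same, so this split does not apply directly.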



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)