Posted to issues@mesos.apache.org by "Dan Adkins (JIRA)" <ji...@apache.org> on 2018/10/19 18:22:00 UTC

[jira] [Created] (MESOS-9335) LIBPROCESS_ADVERTISE_IP is not passed to mesos-docker-executor

Dan Adkins created MESOS-9335:
---------------------------------

             Summary: LIBPROCESS_ADVERTISE_IP is not passed to mesos-docker-executor
                 Key: MESOS-9335
                 URL: https://issues.apache.org/jira/browse/MESOS-9335
             Project: Mesos
          Issue Type: Bug
          Components: executor
    Affects Versions: 1.7.0
         Environment: Linux ip-10-33-15-130 4.4.0-1069-aws #79-Ubuntu SMP Mon Sep 24 15:01:41 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Mesos 1.7.0
            Reporter: Dan Adkins


When I set both LIBPROCESS_IP and LIBPROCESS_ADVERTISE_IP for my mesos-slave, only LIBPROCESS_IP gets propagated to mesos-docker-executor. I have to set them both to avoid a hostname lookup, which doesn't work in my environment. LIBPROCESS_IP is set to 0.0.0.0 so that the slave will bind to any IP address (and still be reachable locally at port 5051 for metrics gathering), while LIBPROCESS_ADVERTISE_IP is set to my externally reachable IP address so the rest of the cluster can talk to it. Lo and behold, with this setup, the executor processes launched by my slave were failing on the dreaded hostname lookup.
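To make the failure condition concrete, here is a minimal sketch of the check implied by the description above. It is not libprocess code: the function name wouldNeedHostnameLookup is hypothetical, and the rule it encodes (a wildcard bind address with no advertise address forces a hostname lookup) is an assumption taken from the behavior reported in this issue.

```cpp
#include <map>
#include <string>

// Hypothetical helper, not part of Mesos or libprocess.
// Assumption (from this report): a process that binds to the wildcard
// address 0.0.0.0 and has no LIBPROCESS_ADVERTISE_IP set falls back to
// a hostname lookup to find an address to advertise.
bool wouldNeedHostnameLookup(const std::map<std::string, std::string>& env)
{
  const auto ip = env.find("LIBPROCESS_IP");
  const bool bindsWildcard = ip != env.end() && ip->second == "0.0.0.0";
  const bool hasAdvertiseIp = env.count("LIBPROCESS_ADVERTISE_IP") > 0;
  return bindsWildcard && !hasAdvertiseIp;
}
```

Under that assumption, the slave environment shown below would return false, while the executor environment, which lacks LIBPROCESS_ADVERTISE_IP, would return true.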

I notice there is code to inject LIBPROCESS_IP into the executor environment, but no mention of LIBPROCESS_ADVERTISE_IP.

[https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L9974-L9983]
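For illustration only, the propagation I would expect looks roughly like the sketch below. This is not the code in slave.cpp: the Environment alias and the function propagateLibprocessEnv are hypothetical names, and the environment is modeled as a plain std::map rather than whatever types Mesos actually uses.

```cpp
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for an environment; an assumption for this sketch.
using Environment = std::map<std::string, std::string>;

// Hypothetical helper, not part of Mesos: copies the libprocess address
// variables from the agent's environment into the executor's environment
// whenever they are set on the agent.
Environment propagateLibprocessEnv(
    const Environment& agentEnv,
    Environment executorEnv)
{
  static const std::vector<std::string> names = {
    "LIBPROCESS_IP",
    "LIBPROCESS_ADVERTISE_IP",
  };

  for (const std::string& name : names) {
    const auto it = agentEnv.find(name);
    if (it != agentEnv.end()) {
      executorEnv[name] = it->second;
    }
  }

  return executorEnv;
}
```

With the slave environment from this report as agentEnv, the result would carry both LIBPROCESS_IP=0.0.0.0 and LIBPROCESS_ADVERTISE_IP=10.33.15.130, while unrelated variables such as LOGS would not be copied.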

Here's the command line and environment for my slave:

LIBPROCESS_IP=0.0.0.0

MASTER=zk://10.33.13.250:2181,10.33.9.108:2181,10.33.7.6:2181/mesos

LC_ALL=en_US.UTF-8

LOGS=/var/log/mesos

LIBPROCESS_ADVERTISE_IP=10.33.15.130

PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

PWD=/

LANG=en_US.UTF-8

SHLVL=0

ULIMIT=-n 8192

/usr/sbin/mesos-slave --master=zk://10.33.13.250:2181,10.33.9.108:2181,10.33.7.6:2181/mesos --log_dir=/var/log/mesos --containerizers=docker,mesos --executor_registration_timeout=5mins --work_dir=/mesos

And here's the command line and environment for the executor process it attempted to run:

LIBPROCESS_IP=0.0.0.0

LIBPROCESS_PORT=0

MESOS_AGENT_ENDPOINT=10.33.15.130:5051

MESOS_CHECKPOINT=0

MESOS_DIRECTORY=/mesos/slaves/7c587a36-c4ed-48ce-bfa2-2b0d6e8274b2-S3864/frameworks/dummy_sleep-func-dadkins-d84e56b1a9/executors/dummy_sleep-func-dadkins-d84e56b1a9-func_0/runs/6b5adff6-c745-49ce-93c3-682bf7a23aca

MESOS_EXECUTOR_ID=dummy_sleep-func-dadkins-d84e56b1a9-func_0

MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD=5secs

MESOS_FRAMEWORK_ID=dummy_sleep-func-dadkins-d84e56b1a9

MESOS_HTTP_COMMAND_EXECUTOR=0

MESOS_NATIVE_JAVA_LIBRARY=/usr/lib/libmesos-1.7.0.so

MESOS_NATIVE_LIBRARY=/usr/lib/libmesos-1.7.0.so

MESOS_SLAVE_ID=7c587a36-c4ed-48ce-bfa2-2b0d6e8274b2-S3864

MESOS_SLAVE_PID=slave(1)@10.33.15.130:5051

PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

mesos-docker-executor --cgroups_enable_cfs=false --container=mesos-6b5adff6-c745-49ce-93c3-682bf7a23aca --docker=docker --docker_socket=/var/run/docker.sock --help=false --initialize_driver_logging=true --launcher_dir=/usr/libexec/mesos --logbufsecs=0 --logging_level=INFO --mapped_directory=/mnt/mesos/sandbox --quiet=false --sandbox_directory=/mesos/slaves/7c587a36-c4ed-48ce-bfa2-2b0d6e8274b2-S3864/frameworks/dummy_sleep-func-dadkins-d84e56b1a9/executors/dummy_sleep-func-dadkins-d84e56b1a9



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)