Posted to issues@mesos.apache.org by "Jorge Machado (JIRA)" <ji...@apache.org> on 2018/10/08 09:59:00 UTC

[jira] [Created] (MESOS-9299) Mesos Agent does not start if Zookeeper host does not exist

Jorge Machado created MESOS-9299:
------------------------------------

             Summary: Mesos Agent does not start if Zookeeper host does not exist
                 Key: MESOS-9299
                 URL: https://issues.apache.org/jira/browse/MESOS-9299
             Project: Mesos
          Issue Type: Bug
    Affects Versions: 1.8.0
         Environment: We are running on:
SUSE Linux Enterprise Server 12 (x86_64)
VERSION = 12
PATCHLEVEL = 2
# This file is deprecated and will be removed in a future service pack or release.
# Please check /etc/os-release for details about this release.
NAME="SLES"
VERSION="12-SP2"
VERSION_ID="12.2"
PRETTY_NAME="SUSE Linux Enterprise Server 12 SP2"
ID="sles"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:12:sp2"
            Reporter: Jorge Machado


Hi Guys, 

We have a situation where, when we remove a host from the ZooKeeper cluster, the mesos-agents will not start. I think this is wrong and the library should just try the next host in the list.

Here are my logs: 

W1008 11:49:54.802669 21876 zookeeper.cpp:124] zookeeper_init failed: No such file or directory ; retrying in 1 second
2018-10-08 11:49:55,802:21847(0x7fc3d2e56700):ZOO_INFO@log_env@753: Client environment:zookeeper.version=zookeeper C client 3.4.8
2018-10-08 11:49:55,802:21847(0x7fc3d2e56700):ZOO_INFO@log_env@757: Client environment:host.name=SMTCAZ2002
2018-10-08 11:49:55,802:21847(0x7fc3d2e56700):ZOO_INFO@log_env@764: Client environment:os.name=Linux
2018-10-08 11:49:55,802:21847(0x7fc3d2e56700):ZOO_INFO@log_env@765: Client environment:os.arch=4.4.103-92.56-default
2018-10-08 11:49:55,802:21847(0x7fc3d2e56700):ZOO_INFO@log_env@766: Client environment:os.version=#1 SMP Wed Dec 27 16:24:31 UTC 2017 (2fd2155)
2018-10-08 11:49:55,803:21847(0x7fc3d2e56700):ZOO_INFO@log_env@774: Client environment:user.name=machjor
2018-10-08 11:49:55,803:21847(0x7fc3d2e56700):ZOO_INFO@log_env@782: Client environment:user.home=/root
2018-10-08 11:49:55,803:21847(0x7fc3d2e56700):ZOO_INFO@log_env@794: Client environment:user.dir=/home/machjor/mesos/build
2018-10-08 11:49:55,803:21847(0x7fc3d2e56700):ZOO_INFO@zookeeper_init@827: Initiating client connection, host=_HOST_LIST sessionTimeout=10000 watcher=0x7fc3e9a9b336 sessionId=0 sessionPasswd=<null> context=0x7fc3a4000be0 flags=0
2018-10-08 11:49:55,805:21847(0x7fc3d2e56700):ZOO_ERROR@getaddrs@640: getaddrinfo: No such file or directory

 

_HOST_LIST contains one host that is invalid.

 

To reproduce, build from master as of 08.10.2018 and run mesos-agent.sh --master=zk://somehostThatDoesNotExist,validHosts

 

The same issue occurs in 1.3.2.
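
To illustrate the behavior I would expect (skip any host that fails to resolve and carry on with the rest), here is a rough Python sketch. It is not the ZooKeeper C client or Mesos code; the names usable_servers and resolve are made up for this example, and it assumes plain host[:port] entries (no IPv6 literals) with 2181 as the default port.

```python
import socket


def resolve(host, port):
    """Default resolver: raises socket.gaierror if the host cannot be resolved."""
    return socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)


def usable_servers(host_list, resolver=resolve, default_port=2181):
    """Return the 'host:port' entries that resolve, skipping those that do not.

    host_list is a comma-separated string such as 'zk1:2181,zk2:2181'.
    Raises RuntimeError only when no entry at all can be resolved.
    """
    usable = []
    for entry in host_list.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.rpartition(":")
        if not host:
            # No ':' in the entry, so fall back to the default port.
            host, port = entry, str(default_port)
        try:
            resolver(host, int(port))
        except (socket.gaierror, ValueError):
            # Unresolvable host or malformed port: skip it, keep the others.
            continue
        usable.append(f"{host}:{port}")
    if not usable:
        raise RuntimeError("no ZooKeeper host could be resolved")
    return usable
```

With something like this, a list such as somehostThatDoesNotExist,validHosts would still yield the valid hosts, and startup would only fail when every entry is unresolvable.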

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)