Posted to user@mesos.apache.org by 陈强 <ch...@qiyi.com> on 2016/05/16 06:38:45 UTC

No agents were shown/found in DC/OS cluster

Hi all,


I installed a DC/OS cluster, but no agents were shown or found in the
cluster after the GUI installer finished.
The other components are OK. Has anyone else met this issue? Thanks.


Best Regards.
Chen, Qiang


Re: No agents were shown/found in DC/OS cluster

Posted by haosdent <ha...@gmail.com>.
+users@dcos.io


-- 
Best Regards,
Haosdent Huang

Re: No agents were shown/found in DC/OS cluster

Posted by QiangChen <ch...@qiyi.com>.
I think I have found the root cause: the ip-detect script provided in the
official documentation is meant to return the source IP (the agent's own
IP), but on my network it returned the gateway IP instead. The agent
should bind to <slave>:5051, not <gw>:5051.

I adapted the script to my network environment so that it returns the
source IP, and DC/OS now works.

The updated ip-detect script is as follows:

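```
#!/usr/bin/env bash
set -o nounset -o errexit

MASTER_IP=10.221.82.185

# Previous version: printed the last IPv4 address found in the matching
# route, which was the gateway IP on this network.
#echo $(/usr/sbin/ip route show to match 10.221.82.185 | grep -Eo '[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}' | tail -1)

# Ask the kernel which source address it would use to reach the master.
echo $(/usr/sbin/ip -d route get "${MASTER_IP}" | grep -Eo 'src ([0-9.]*)' | grep -o '[0-9.]*')
```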
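
The `src` extraction in the script above can also be written with a single
`awk` call. The following is only a sketch under assumptions, not the
official DC/OS ip-detect: the helper name `extract_src_ip` and the sample
route line (including the gateway 10.221.80.1 and the source 10.221.80.23)
are made up for illustration; only the master address 10.221.82.185 comes
from this thread.

```shell
#!/usr/bin/env bash
set -o nounset

# Hypothetical helper: read `ip route get` output on stdin and print the
# address that follows the "src" keyword (the local source address).
extract_src_ip() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "src") { print $(i + 1); exit } }'
}

# Made-up sample of what `ip route get 10.221.82.185` might print.
sample='10.221.82.185 via 10.221.80.1 dev eth0 src 10.221.80.23'

printf '%s\n' "$sample" | extract_src_ip
```

In a real ip-detect script the input would come from
`/usr/sbin/ip route get "$MASTER_IP"` instead of the sample string.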

Thanks all for your help.


Re: No agents were shown/found in DC/OS cluster

Posted by Chengwei Yang <ch...@gmail.com>.
ABRT generally means that something critical happened, and you may not find
the cause in stdout/stderr or in journalctl.

You could try running the ExecXXX commands from its service file manually in
a console to see what happens and get some hints.
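
One way to follow this advice is to list the Exec* commands from the unit
file and then run them by hand. The sketch below is an illustration, not
something posted in the thread: `list_exec_lines` is a hypothetical helper,
and the sample unit text is reconstructed from the `systemctl status` output
quoted in this thread, so the real unit file probably contains further
settings (for example environment definitions) that are not shown here.

```shell
#!/usr/bin/env bash
set -o nounset

# Hypothetical helper: print the command part of every Exec*= line
# (ExecStartPre=, ExecStart=, ...) found in unit-file text on stdin.
list_exec_lines() {
  sed -n 's/^Exec[A-Za-z]*=//p'
}

# Sample unit text reconstructed from the posted status output (assumption:
# the real file has more lines than these).
sample_unit='[Service]
ExecStartPre=/bin/ping -c1 ready.spartan
ExecStartPre=/bin/ping -c1 leader.mesos
ExecStart=/opt/mesosphere/packages/mesos--0335ca0d3700ea88ad8b808f3b1b84d747ed07f0/bin/mesos-slave'

printf '%s\n' "$sample_unit" | list_exec_lines
```

On the agent host, `systemctl cat dcos-mesos-slave | list_exec_lines` would
feed in the real unit file instead of the sample text.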

-- 
Thanks,
Chengwei


Re: No agents were shown/found in DC/OS cluster

Posted by Avinash Sridharan <av...@mesosphere.io>.
Looks like it's the Agent that is getting killed:
 Main PID: 8982 (code=killed, signal=ABRT)


As haosdent mentioned, can you get the output of
"journalctl -u dcos-mesos-slave -b | tail -100"?

You might be able to spot why the Agent exited.
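
Because 100 lines of journal output can be noisy, it may help to filter for
error and fatal records first. This is a sketch under assumptions: Mesos logs
through glog, whose records begin with a severity letter (I, W, E, F)
followed by the month and day; the helper name `glog_errors` and both sample
journal lines are invented for illustration and are not output from this
cluster.

```shell
#!/usr/bin/env bash
set -o nounset

# Hypothetical helper: keep only lines containing a glog ERROR or FATAL
# record, i.e. a token such as "E0516 15:30:42." or "F0516 15:30:42.".
glog_errors() {
  grep -E '(^|[[:space:]])[EF][0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}\.' || true
}

# Invented sample lines in journald short format (not real cluster output).
sample='May 16 15:30:42 host mesos-slave[8982]: I0516 15:30:42.100000  8982 main.cpp:1] starting agent
May 16 15:30:42 host mesos-slave[8982]: F0516 15:30:42.200000  8982 main.cpp:2] example fatal message'

printf '%s\n' "$sample" | glog_errors
```

On the agent host the input would be `journalctl -u dcos-mesos-slave -b`
rather than the sample text.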



-- 
Avinash Sridharan, Mesosphere
+1 (323) 702 5245

Re: No agents were shown/found in DC/OS cluster

Posted by haosdent <ha...@gmail.com>.
Hi, may we have the `journalctl` output from the Mesos agent? Or how about
launching the Mesos agent directly?


-- 
Best Regards,
Haosdent Huang

Re: No agents were shown/found in DC/OS cluster

Posted by 陈强 <ch...@qiyi.com>.
Yes, I checked: they are configured to use the same mesos-dns in
/etc/resolv.conf, and mesos-dns is also running correctly. Maybe that is
not the root cause?


-- 
陈强

Technology Product Center, Computing Cloud, Elastic Computing
Mobile: +86 15900964316
Extension: 8377


Re: No agents were shown/found in DC/OS cluster

Posted by 陈强 <ch...@qiyi.com>.
Hi Stephen Gran,

Yes, mesos-dns is running.
I used the DC/OS GUI installer. Where should I configure or check that the
slaves use mesos-dns? Thanks.

[root@chenqiang-worker-dev007-shgq ~]# systemctl status dcos-mesos-dns -l
● dcos-mesos-dns.service - Mesos DNS: DNS based Service Discovery
    Loaded: loaded (/opt/mesosphere/packages/mesos-dns--ee4c3c37c9be64426152a17a9aa094187357ae3b/dcos.target.wants_master/dcos-mesos-dns.service; enabled; vendor preset: disabled)
    Active: active (running) since Mon 2016-05-16 11:07:44 CST; 4h 45min ago
  Main PID: 14718 (mesos-dns)
    Memory: 7.0M
    CGroup: /system.slice/dcos-mesos-dns.service
            └─14718 /opt/mesosphere/bin/mesos-dns --config=/opt/mesosphere/etc/mesos-dns.json -logtostderr=true

May 16 11:07:44 chenqiang-worker-dev007-XXX systemd[1]: Started Mesos DNS: DNS based Service Discovery.
May 16 11:07:44 chenqiang-worker-dev007-XXX systemd[1]: Starting Mesos DNS: DNS based Service Discovery...
May 16 11:07:44 chenqiang-worker-dev007-XXX mesos-dns[14718]: 2016/05/16 11:07:44 Connected to 127.0.0.1:2181
May 16 11:07:44 chenqiang-worker-dev007-XXX mesos-dns[14718]: 2016/05/16 11:07:44 Authenticated: id=95903287146905603, timeout=40000





Re: No agents were shown/found in DC/OS cluster

Posted by Stephen Gran <st...@piksel.com>.
Hi,

The ExecStartPre failed - it looks like dns isn't working for some 
reason.  Can you check if mesos-dns is running and that the slaves are 
configured to use it?

Cheers,


-- 
Stephen Gran
Senior Technical Architect

picture the possibilities | piksel.com

Re: No agents were shown/found in DC/OS cluster

Posted by 陈强 <ch...@qiyi.com>.
It seems the mesos-slave service doesn't start:

[Mon May 16 15:30:42 root@chenqiang-worker-dev004-XXX ~]# systemctl status dcos-mesos-slave.service -l
● dcos-mesos-slave.service - Mesos Agent: DC/OS Mesos Agent Service
    Loaded: loaded (/opt/mesosphere/packages/mesos--0335ca0d3700ea88ad8b808f3b1b84d747ed07f0/dcos.target.wants_slave/dcos-mesos-slave.service; enabled; vendor preset: disabled)
    Active: activating (auto-restart) (Result: signal) since Mon 2016-05-16 15:30:42 CST; 1s ago
   Process: 8982 ExecStart=/opt/mesosphere/packages/mesos--0335ca0d3700ea88ad8b808f3b1b84d747ed07f0/bin/mesos-slave (code=killed, signal=ABRT)
   Process: 8978 ExecStartPre=/bin/ping -c1 leader.mesos (code=exited, status=0/SUCCESS)
   Process: 8976 ExecStartPre=/bin/ping -c1 ready.spartan (code=exited, status=0/SUCCESS)
  Main PID: 8982 (code=killed, signal=ABRT)

May 16 15:30:42 chenqiang-worker-dev004-XXX systemd[1]: Unit dcos-mesos-slave.service entered failed state.
May 16 15:30:42 chenqiang-worker-dev004-XXX systemd[1]: dcos-mesos-slave.service failed.

But I can't find the root cause that makes mesos-slave abort (ABRT).
