
Posted to issues@mesos.apache.org by "Adam B (JIRA)" <ji...@apache.org> on 2015/10/13 09:43:05 UTC

[jira] [Commented] (MESOS-1826) Improve logging for when master cannot connect to slaves

    [ https://issues.apache.org/jira/browse/MESOS-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954562#comment-14954562 ] 

Adam B commented on MESOS-1826:
-------------------------------

[~gyliu] I think you can reproduce this by setting `LIBPROCESS_IP=127.0.0.1` when starting a slave on a different machine from the master. The slave knows the master's IP and can send RegisterSlave messages, but it advertises 127.0.0.1 as its own address, so the master's SlaveRegistered acknowledgement never reaches it and the slave keeps retrying its registration.
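
To illustrate why that setting breaks registration, here is a minimal Python sketch (not Mesos code) of the kind of sanity check one could run on the address a slave advertises. The function name `advertised_address_problem` and its parameters are hypothetical; the only assumption taken from the comment above is that the slave advertises whatever `LIBPROCESS_IP` is set to.

```python
import ipaddress


def advertised_address_problem(advertised_ip, same_host_as_master):
    """Return a warning string if a master could not reply to this address.

    Hypothetical helper for illustration only; it is not part of Mesos.
    Returns None when no problem is detected.
    """
    addr = ipaddress.ip_address(advertised_ip)
    # A loopback address only makes sense when master and slave share a host.
    # From a remote master, 127.0.0.1 points back at the master's own machine.
    if addr.is_loopback and not same_host_as_master:
        return (
            f"slave advertises loopback address {addr}; "
            "replies from a remote master cannot reach it"
        )
    return None


if __name__ == "__main__":
    print(advertised_address_problem("127.0.0.1", same_host_as_master=False))
```

A check like this catches only the loopback case; an address that is routable in principle but blocked by a firewall would still need an actual connection attempt to detect.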

> Improve logging for when master cannot connect to slaves
> --------------------------------------------------------
>
>                 Key: MESOS-1826
>                 URL: https://issues.apache.org/jira/browse/MESOS-1826
>             Project: Mesos
>          Issue Type: Improvement
>    Affects Versions: 0.20.0
>            Reporter: Thomas Rampelberg
>            Assignee: Guangya Liu
>            Priority: Minor
>              Labels: newbie
>
> When first setting up a Mesos cluster, it is possible to get into a state where your slaves are constantly re-registering. This happens because the slave pid is not reachable from the master.
> Currently, the master logs make it hard to tell that this is the problem. It would be fantastic if there were a better explanation in the logs, something like:
>     Unable to connect to slave X at x.x.x.x:5051. Please make sure that host is reachable from your master.
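
As a rough sketch of what producing such a message could look like, the Python snippet below attempts a plain TCP connection to a slave's advertised address and builds the suggested log line on failure. It is only an illustration and not how the Mesos master is implemented: `check_slave_reachable`, its parameters, and the timeout value are all hypothetical.

```python
import socket


def check_slave_reachable(slave_id, host, port, timeout=2.0):
    """Try a TCP connection to the slave's advertised address.

    Hypothetical helper for illustration only; it is not part of Mesos.
    Returns None if the connection succeeds, otherwise a log message
    modelled on the wording suggested in the issue description.
    """
    try:
        # create_connection raises OSError on refusal, timeout, or no route.
        with socket.create_connection((host, port), timeout=timeout):
            return None
    except OSError as exc:
        return (
            f"Unable to connect to slave {slave_id} at {host}:{port} ({exc}). "
            "Please make sure that host is reachable from your master."
        )


if __name__ == "__main__":
    # Example addresses are placeholders, not values from the issue.
    message = check_slave_reachable("X", "127.0.0.1", 5051)
    if message is not None:
        print(message)
```

Including the underlying socket error in the message is a design choice of this sketch: it lets an operator distinguish a refused connection from a timeout without further digging.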



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)