Posted to dev@mesos.apache.org by Joerg Schad <jo...@mesosphere.io> on 2015/03/06 14:19:54 UTC
Re: Review Request 29507: Added Configurable Slave Ping Timeouts
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29507/#review75484
-----------------------------------------------------------
src/master/flags.hpp
<https://reviews.apache.org/r/29507/#comment122562>
Shouldn't this also be added to the documentation (i.e. http://mesos.apache.org/documentation/latest/configuration/)?
- Joerg Schad
On Feb. 19, 2015, 8:10 a.m., Adam B wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29507/
> -----------------------------------------------------------
>
> (Updated Feb. 19, 2015, 8:10 a.m.)
>
>
> Review request for mesos, Ben Mahler and Niklas Nielsen.
>
>
> Bugs: MESOS-2110
> https://issues.apache.org/jira/browse/MESOS-2110
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Added new --slave_ping_timeout and --max_slave_ping_timeouts flags
> to mesos-master to supplement the DEFAULT_SLAVE_PING_TIMEOUT (15secs)
> and DEFAULT_MAX_SLAVE_PING_TIMEOUTS (5).
>
> These can be extended if slaves are expected/allowed to be down for
> longer than a minute or two.
>
> Slave will receive master's ping timeout in SlaveRe[re]gisteredMessage.
>
> Beware that this affects recovery from network timeouts as well as
> actual slave node/process failover.
>
>
> Diffs
> -----
>
> src/master/constants.hpp ad3fe81
> src/master/constants.cpp d3d0f71
> src/master/flags.hpp 51a6059
> src/master/master.cpp f10a3cf
> src/messages/messages.proto 58484ae
> src/slave/constants.hpp 12d6e92
> src/slave/constants.cpp 7868bef
> src/slave/slave.hpp 91dae10
> src/slave/slave.cpp aec9525
> src/tests/fault_tolerance_tests.cpp efa5c57
> src/tests/partition_tests.cpp eb16a58
> src/tests/slave_recovery_tests.cpp 8210c52
> src/tests/slave_tests.cpp 153d9d6
>
> Diff: https://reviews.apache.org/r/29507/diff/
>
>
> Testing
> -------
>
> Manually tested slave failover/shutdown with master using different --slave_ping_timeout and --max_slave_ping_timeouts.
> Ran unit tests with shorter non-default values for ping timeouts.
> `make check` with new unit tests: ShortPingTimeoutUnreachableMaster and ShortPingTimeoutUnreachableSlave
>
>
> Thanks,
>
> Adam B
>
>
Re: Review Request 29507: Added Configurable Slave Ping Timeouts
Posted by Adam B <ad...@mesosphere.io>.
> On March 6, 2015, 5:19 a.m., Joerg Schad wrote:
> > src/master/flags.hpp, line 383
> > <https://reviews.apache.org/r/29507/diff/4/?file=868836#file868836line383>
> >
> > Shouldn't this also be added to the documentation (i.e. http://mesos.apache.org/documentation/latest/configuration/)?
Excellent point. Will update that in the next patch revision.
- Adam
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29507/#review75484
-----------------------------------------------------------
Re: Review Request 29507: Added Configurable Slave Ping Timeouts
Posted by Joerg Schad <jo...@mesosphere.io>.
> On March 6, 2015, 1:19 p.m., Joerg Schad wrote:
> > src/master/flags.hpp, line 383
> > <https://reviews.apache.org/r/29507/diff/4/?file=868836#file868836line383>
> >
> > Shouldn't this also be added to the documentation (i.e. http://mesos.apache.org/documentation/latest/configuration/)?
>
> Adam B wrote:
> Excellent point. Will update that in the next patch revision.
Thanks for fixing!
- Joerg
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29507/#review75484
-----------------------------------------------------------
On May 14, 2015, 10:01 a.m., Adam B wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29507/
> -----------------------------------------------------------
>
> (Updated May 14, 2015, 10:01 a.m.)
>
>
> Review request for mesos, Ben Mahler and Niklas Nielsen.
>
>
> Bugs: MESOS-2110
> https://issues.apache.org/jira/browse/MESOS-2110
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Added new --slave_ping_timeout and --max_slave_ping_timeouts flags
> to mesos-master to supplement the DEFAULT_SLAVE_PING_TIMEOUT (15secs)
> and DEFAULT_MAX_SLAVE_PING_TIMEOUTS (5).
>
> These can be extended if slaves are expected/allowed to be down for
> longer than a minute or two.
>
> Slave will receive master's ping timeout in SlaveRe[re]gisteredMessage.
>
> Beware that this affects recovery from network timeouts as well as
> actual slave node/process failover.
>
>
> Diffs
> -----
>
> docs/configuration.md 54c4e31
> docs/upgrades.md 2a15694
> src/master/constants.hpp c386eab
> src/master/constants.cpp 9ee17e9
> src/master/flags.hpp 996cf38
> src/master/flags.cpp 5798989
> src/master/master.cpp eaea79d
> src/messages/messages.proto 19e2444
> src/slave/constants.hpp df02043
> src/slave/constants.cpp 07f699a
> src/slave/slave.hpp b62ed7b
> src/slave/slave.cpp 132f83e
> src/tests/partition_tests.cpp f7ee3ab
> src/tests/slave_recovery_tests.cpp c036e9c
> src/tests/slave_tests.cpp acae497
>
> Diff: https://reviews.apache.org/r/29507/diff/
>
>
> Testing
> -------
>
> Manually tested slave failover/shutdown with master using different --slave_ping_timeout and --max_slave_ping_timeouts.
> Ran unit tests with shorter non-default values for ping timeouts.
> `make check` with new unit tests: ShortPingTimeoutUnreachableMaster and ShortPingTimeoutUnreachableSlave
>
>
> Thanks,
>
> Adam B
>
>
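For readers checking the numbers in the description: the quoted defaults of 15secs and 5 multiply out to 75 seconds, which matches the "minute or two" mentioned above. The sketch below only illustrates that arithmetic. It assumes the window before the master gives up on a slave is simply the ping timeout multiplied by the allowed number of missed pings; the helper name `unresponsiveWindowSecs` and the two constants are hypothetical and are not taken from the Mesos source.

```cpp
#include <cstdint>

// Hypothetical helper, not part of Mesos. Assumes the master gives up on a
// slave after `maxPingTimeouts` consecutive pings have each gone unanswered
// for `pingTimeoutSecs` seconds, so the total window is their product.
constexpr std::int64_t unresponsiveWindowSecs(
    std::int64_t pingTimeoutSecs,
    std::int64_t maxPingTimeouts)
{
  return pingTimeoutSecs * maxPingTimeouts;
}

// Default values as quoted in the review description (names are illustrative).
constexpr std::int64_t kDefaultSlavePingTimeoutSecs = 15;
constexpr std::int64_t kDefaultMaxSlavePingTimeouts = 5;

// Compile-time check of the default case: 15 s x 5 missed pings.
static_assert(
    unresponsiveWindowSecs(
        kDefaultSlavePingTimeoutSecs, kDefaultMaxSlavePingTimeouts) == 75,
    "default window should be 75 seconds");
```

Under the same assumption, raising the values to a 30-second timeout and 10 allowed timeouts would stretch the window to 300 seconds. As the description warns, this delay would apply to network partitions as well as to real slave failures.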