Posted to reviews@mesos.apache.org by Greg Mann <gr...@mesosphere.io> on 2019/03/07 00:10:57 UTC
Review Request 70147: WIP: Added a Sequence to the master to order
updates to agent resources.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70147/
-----------------------------------------------------------
Review request for mesos, Benjamin Mahler, Gastón Kleiman, Joseph Wu, and Meng Zhu.
Bugs: MESOS-9460
https://issues.apache.org/jira/browse/MESOS-9460
Repository: mesos
Description
-------
This patch adds a new `Sequence` data member to the master.
It is used to prevent interleavings of master/allocator
state updates that could leave the master and allocator
actors with inconsistent state.
Diffs
-----
src/master/master.hpp 90e08149ece595147ca4a93da215385917a0f372
src/master/master.cpp b9db4ffd4ee8ea4a8e44a35d1afb6c1b8e03d74d
Diff: https://reviews.apache.org/r/70147/diff/1/
Testing
-------
`bin/mesos-tests.sh --gtest_filter="*SpeculativeOperationRacesWithUpdateSlaveMessage*" --gtest_repeat=-1 --gtest_break_on_failure`
Thanks,
Greg Mann
Re: Review Request 70147: WIP: Added a Sequence to the master to order
updates to agent resources.
Posted by Greg Mann <gr...@mesosphere.io>.
(Updated March 7, 2019, 12:26 a.m.)
Review request for mesos, Benjamin Mahler, Gastón Kleiman, Joseph Wu, and Meng Zhu.
Bugs: MESOS-9460
https://issues.apache.org/jira/browse/MESOS-9460
Repository: mesos
Description (updated)
-------
This patch adds a new `Sequence` data member to the master.
It is used to prevent interleavings of master/allocator
state updates that could leave the master and allocator
actors with inconsistent state.
For example, the following interleaving of events would
previously lead to inconsistent state between the master
and allocator:
1) Master receives a RESERVE operation for agent A via the
operator API. This invokes `Master::apply()`, which
calls `allocator->updateAvailable()` for agent A.
2) Master receives an `UpdateSlaveMessage` containing
oversubscribed resources from agent A. The
`Master::updateSlave()` handler invokes
`allocator->updateSlave()`, which uses _stale_ resources
from the `Slave` struct to update the allocator's view
of agent A's resources. Once the allocator processes
that event, it no longer includes the reserved
resources in agent A's total.
3) After the `allocator->updateAvailable()` call from step 1
returns, `Master::_apply()` is invoked, which updates
the `Slave` struct for agent A to include the reserved
resources. The master's and the allocator's views of
agent A's total resources are now inconsistent.
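The interleaving above can be reproduced in a small, single-threaded
model. This is only a sketch under stated assumptions: it does not use
libprocess or the real master/allocator code, and `View`, `Cluster`,
`consistentWithoutSequence()` and `consistentWithSequence()` are
hypothetical names invented for illustration. Resources are reduced to
two integer counters, and the "sequence" is modeled as a plain FIFO of
tasks that each run to completion before the next one starts.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for one actor's view of agent A's total resources.
struct View {
  int reserved = 0;
  int oversubscribed = 0;
  bool operator==(const View& that) const {
    return reserved == that.reserved && oversubscribed == that.oversubscribed;
  }
};

// Hypothetical model of the master and allocator actors.
struct Cluster {
  View master;     // Plays the role of the master's `Slave` struct.
  View allocator;  // The allocator's view of agent A.
  std::deque<std::function<void()>> allocatorQueue;  // Allocator mailbox.

  // Processes all pending allocator events in FIFO order.
  void runAllocator() {
    while (!allocatorQueue.empty()) {
      std::function<void()> event = std::move(allocatorQueue.front());
      allocatorQueue.pop_front();
      event();
    }
  }

  // Step 1: dispatch the reservation to the allocator and return the
  // continuation (the `_apply` step) that updates the master afterwards.
  std::function<void()> apply(int cpus) {
    assert(cpus > 0);
    allocatorQueue.push_back([this, cpus] { allocator.reserved += cpus; });
    return [this, cpus] { master.reserved += cpus; };
  }

  // Step 2: snapshot the master's view (possibly stale) and send it to
  // the allocator as agent A's new total.
  void updateSlave(int oversubscribed) {
    master.oversubscribed = oversubscribed;
    View total = master;
    allocatorQueue.push_back([this, total] { allocator = total; });
  }
};

// Steps 1-3 in the problematic order: `updateSlave` is handled before
// the continuation from step 1 has run.
bool consistentWithoutSequence() {
  Cluster c;
  std::function<void()> continuation = c.apply(2);
  c.updateSlave(4);
  c.runAllocator();
  continuation();
  return c.master == c.allocator;
}

// The same events, but each update runs to completion (including its
// continuation) before the next one is started.
bool consistentWithSequence() {
  Cluster c;
  std::deque<std::function<void()>> sequence;
  sequence.push_back([&c] {
    std::function<void()> continuation = c.apply(2);
    c.runAllocator();
    continuation();
  });
  sequence.push_back([&c] {
    c.updateSlave(4);
    c.runAllocator();
  });
  while (!sequence.empty()) {
    std::function<void()> task = std::move(sequence.front());
    sequence.pop_front();
    task();
  }
  return c.master == c.allocator;
}
```

In this model `consistentWithoutSequence()` returns false, because the
allocator's total is overwritten with a snapshot taken before the
master recorded the reservation, while `consistentWithSequence()`
returns true. The actual patch orders asynchronous operations; the
synchronous FIFO here only illustrates the ordering guarantee.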
Diffs
-----
src/master/master.hpp 90e08149ece595147ca4a93da215385917a0f372
src/master/master.cpp b9db4ffd4ee8ea4a8e44a35d1afb6c1b8e03d74d
Diff: https://reviews.apache.org/r/70147/diff/1/
Testing
-------
`bin/mesos-tests.sh --gtest_filter="*SpeculativeOperationRacesWithUpdateSlaveMessage*" --gtest_repeat=-1 --gtest_break_on_failure`
Thanks,
Greg Mann
Re: Review Request 70147: WIP: Added a Sequence to the master to order
updates to agent resources.
Posted by Greg Mann <gr...@mesosphere.io>.
(Updated March 7, 2019, 12:25 a.m.)
Review request for mesos, Benjamin Mahler, Gastón Kleiman, Joseph Wu, and Meng Zhu.
Bugs: MESOS-9460
https://issues.apache.org/jira/browse/MESOS-9460
Repository: mesos
Description (updated)
-------
This patch adds a new `Sequence` data member to the master.
It is used to prevent interleavings of master/allocator
state updates that could leave the master and allocator
actors with inconsistent state.
For example, the following interleaving of events would
previously lead to inconsistent state between the master
and allocator:
1) Master receives a RESERVE operation for agent A via the
operator API. This invokes `Master::apply()`, which
calls `allocator->updateAvailable()` for agent A.
2) Master receives an `UpdateSlaveMessage` containing
oversubscribed resources from agent A. The handler
`Master::updateSlave()` invokes
`allocator->updateSlave()`, which uses _stale_ resources
from the `Slave` struct to update the allocator's view
of agent A's resources. Once the allocator processes
that event, it no longer includes the reserved
resources in agent A's total.
3) After the `allocator->updateAvailable()` call from step 1
returns, `Master::_apply()` is invoked, which updates
the `Slave` struct for agent A to include the reserved
resources. The master's and the allocator's views of
agent A's total resources are now inconsistent.
Diffs
-----
src/master/master.hpp 90e08149ece595147ca4a93da215385917a0f372
src/master/master.cpp b9db4ffd4ee8ea4a8e44a35d1afb6c1b8e03d74d
Diff: https://reviews.apache.org/r/70147/diff/1/
Testing
-------
`bin/mesos-tests.sh --gtest_filter="*SpeculativeOperationRacesWithUpdateSlaveMessage*" --gtest_repeat=-1 --gtest_break_on_failure`
Thanks,
Greg Mann