You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@mesos.apache.org by Neil Conway <ne...@gmail.com> on 2016/12/02 00:24:53 UTC

Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/
-----------------------------------------------------------

(Updated Dec. 2, 2016, 12:24 a.m.)


Review request for mesos and Vinod Kone.


Changes
-------

Fix a few places where we neglected to check `unreachableTasks`; improve tests.


Bugs: MESOS-6619
    https://issues.apache.org/jira/browse/MESOS-6619


Repository: mesos


Description
-------

Before partition-awareness, when an agent failed health checks, the
master removed the agent from the registry, marked all of its tasks
TASK_LOST, and moved them to the `completedTasks` list in the master's
memory. Although "lost" tasks might still be running, partitioned agents
would only be allowed to re-register if the master failed over, in which
case the `completedTasks` map would be emptied.

When partition-awareness was introduced, we initially followed the same
scheme, with the only difference that partition-aware tasks are marked
TASK_UNREACHABLE, not TASK_LOST.

This scheme has a few shortcomings. First, partition-aware tasks might
resume running when the partitioned agent re-registers. Second, we
re-added non-partition aware tasks when the agent re-registered but then
marked them completed when the framework is shutdown, resulting in two
entries in `completedTasks`.

This commit introduces a separate bounded map, `unreachableTasks`. These
tasks are reported separately via the HTTP endpoints, because they have
different semantics (unlike completed tasks, unreachable tasks can
resume running). The size of this map is limited by a new master flag,
`--max_unreachable_tasks_per_framework`. This commit also changes the
master to omit re-adding non-partition-aware tasks on re-registering
agents (unless the master has failed over): those tasks will shortly be
shutdown anyway.

Finally, this commit fixes a minor bug in the previous code: the
previous coding neglected to shutdown non-partition-aware frameworks
running on pre-1.0 Mesos agents that re-register with the master after
a network partition.


Diffs (updated)
-----

  docs/configuration.md efe3e9bd9d203a7ba44adf4ead24f14b8b577637 
  include/mesos/master/master.proto 3553c683c17004ac1831ec90271aa8584c950e53 
  include/mesos/v1/master/master.proto 022b491b7d5c49c5aeddf4ffc97c148f55629c95 
  src/master/constants.hpp 5dd0667f62d2c0617cc0d5aed8cc005bd8344c88 
  src/master/flags.hpp 6a17b763dc76daa10073394f416b049e97a44238 
  src/master/flags.cpp 9bfb40e22820c3ced40b128280ec63288fea8b41 
  src/master/http.cpp ac560d1fdd219d0de0c5d987a32a7112e149602f 
  src/master/master.hpp 877ca9010d0d6efc97f3d71fbd27272a255409d0 
  src/master/master.cpp e03a2e8025943825a2902102c43dc0eb66bacb6a 
  src/tests/partition_tests.cpp 5a0d4bd2de6a5aa0e9fdf0d34cd10d16fd4e34a1 

Diff: https://reviews.apache.org/r/54183/diff/


Testing
-------

`make check`


Thanks,

Neil Conway

Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

Posted by Vinod Kone <vi...@gmail.com>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/#review162098
-----------------------------------------------------------



- Vinod Kone


On Jan. 17, 2017, 6:36 p.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54183/
> -----------------------------------------------------------
> 
> (Updated Jan. 17, 2017, 6:36 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6619
>     https://issues.apache.org/jira/browse/MESOS-6619
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Before partition-awareness, when an agent failed health checks, the
> master removed the agent from the registry, marked all of its tasks
> TASK_LOST, and moved them to the `completedTasks` list in the master's
> memory. Although "lost" tasks might still be running, partitioned agents
> would only be allowed to re-register if the master failed over, in which
> case the `completedTasks` map would be emptied.
> 
> When partition-awareness was introduced, we initially followed the same
> scheme, with the only difference that partition-aware tasks are marked
> TASK_UNREACHABLE, not TASK_LOST.
> 
> This scheme has a few shortcomings. First, partition-aware tasks might
> resume running when the partitioned agent re-registers. Second, we
> re-added non-partition aware tasks when the agent re-registered but then
> marked them completed when the framework is shutdown, resulting in two
> entries in `completedTasks`.
> 
> This commit introduces a separate bounded map, `unreachableTasks`. These
> tasks are reported separately via the HTTP endpoints, because they have
> different semantics (unlike completed tasks, unreachable tasks can
> resume running). The size of this map is limited by a new master flag,
> `--max_unreachable_tasks_per_framework`. This commit also changes the
> master to omit re-adding non-partition-aware tasks on re-registering
> agents (unless the master has failed over): those tasks will shortly be
> shutdown anyway.
> 
> Finally, this commit fixes a minor bug in the previous code: the
> previous coding neglected to shutdown non-partition-aware frameworks
> running on pre-1.0 Mesos agents that re-register with the master after
> a network partition.
> 
> 
> Diffs
> -----
> 
>   docs/configuration.md 2113d06f58dddc0a28ae1241a24096266fe39801 
>   include/mesos/master/master.proto 03203c8ddd05876339b1c56789b7bb140499d49d 
>   include/mesos/v1/master/master.proto f8edf39a68752c8601cece345f52bce8b9a8a68b 
>   src/master/constants.hpp 900b69460de9280f30b16b86925e436054e080ca 
>   src/master/flags.hpp 6a17b763dc76daa10073394f416b049e97a44238 
>   src/master/flags.cpp 124bfbca6be9d5d60597b9a1793bd807202da8ec 
>   src/master/http.cpp a44621f39cb059e654a56f57f75b38947f3a4230 
>   src/master/master.hpp 44f4fecb1fbe8bebf830990a59a5462338e6e004 
>   src/master/master.cpp b863ff6e93931c3d1ee056248084c7f44caf2fd9 
>   src/tests/partition_tests.cpp 72013d1bfee275c6f3cb90173f0c408d55e0bc5d 
> 
> Diff: https://reviews.apache.org/r/54183/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Neil Conway
> 
>

Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

Posted by Vinod Kone <vi...@gmail.com>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/#review162304
-----------------------------------------------------------


Ship it!




Ship It!

- Vinod Kone


On Jan. 18, 2017, 7:18 p.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54183/
> -----------------------------------------------------------
> 
> (Updated Jan. 18, 2017, 7:18 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6619
>     https://issues.apache.org/jira/browse/MESOS-6619
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Before partition-awareness, when an agent failed health checks, the
> master removed the agent from the registry, marked all of its tasks
> TASK_LOST, and moved them to the `completedTasks` list in the master's
> memory. Although "lost" tasks might still be running, partitioned agents
> would only be allowed to re-register if the master failed over, in which
> case the `completedTasks` map would be emptied.
> 
> When partition-awareness was introduced, we initially followed the same
> scheme, with the only difference that partition-aware tasks are marked
> TASK_UNREACHABLE, not TASK_LOST.
> 
> This scheme has a few shortcomings. First, partition-aware tasks might
> resume running when the partitioned agent re-registers. Second, we
> re-added non-partition aware tasks when the agent re-registered but then
> marked them completed when the framework is shutdown, resulting in two
> entries in `completedTasks`.
> 
> This commit introduces a separate bounded map, `unreachableTasks`. These
> tasks are reported separately via the HTTP endpoints, because they have
> different semantics (unlike completed tasks, unreachable tasks can
> resume running). The size of this map is limited by a new master flag,
> `--max_unreachable_tasks_per_framework`. This commit also changes the
> master to omit re-adding non-partition-aware tasks on re-registering
> agents (unless the master has failed over): those tasks will shortly be
> shutdown anyway.
> 
> Finally, this commit fixes a minor bug in the previous code: the
> previous coding neglected to shutdown non-partition-aware frameworks
> running on pre-1.0 Mesos agents that re-register with the master after
> a network partition.
> 
> 
> Diffs
> -----
> 
>   docs/configuration.md 2113d06f58dddc0a28ae1241a24096266fe39801 
>   include/mesos/master/master.proto 03203c8ddd05876339b1c56789b7bb140499d49d 
>   include/mesos/v1/master/master.proto f8edf39a68752c8601cece345f52bce8b9a8a68b 
>   src/master/constants.hpp 900b69460de9280f30b16b86925e436054e080ca 
>   src/master/flags.hpp 6a17b763dc76daa10073394f416b049e97a44238 
>   src/master/flags.cpp 124bfbca6be9d5d60597b9a1793bd807202da8ec 
>   src/master/http.cpp a44621f39cb059e654a56f57f75b38947f3a4230 
>   src/master/master.hpp 8e8a9037af11cf95961b6498540a0fd486ed091b 
>   src/master/master.cpp 73159328ce3fd838e02eba0e6a30cf69efc319ba 
>   src/tests/partition_tests.cpp 72013d1bfee275c6f3cb90173f0c408d55e0bc5d 
> 
> Diff: https://reviews.apache.org/r/54183/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Neil Conway
> 
>

Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

Posted by Neil Conway <ne...@gmail.com>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/
-----------------------------------------------------------

(Updated Jan. 18, 2017, 7:18 p.m.)


Review request for mesos and Vinod Kone.


Changes
-------

Remove helper functions.


Bugs: MESOS-6619
    https://issues.apache.org/jira/browse/MESOS-6619


Repository: mesos


Description
-------

Before partition-awareness, when an agent failed health checks, the
master removed the agent from the registry, marked all of its tasks
TASK_LOST, and moved them to the `completedTasks` list in the master's
memory. Although "lost" tasks might still be running, partitioned agents
would only be allowed to re-register if the master failed over, in which
case the `completedTasks` map would be emptied.

When partition-awareness was introduced, we initially followed the same
scheme, with the only difference that partition-aware tasks are marked
TASK_UNREACHABLE, not TASK_LOST.

This scheme has a few shortcomings. First, partition-aware tasks might
resume running when the partitioned agent re-registers. Second, we
re-added non-partition aware tasks when the agent re-registered but then
marked them completed when the framework is shutdown, resulting in two
entries in `completedTasks`.

This commit introduces a separate bounded map, `unreachableTasks`. These
tasks are reported separately via the HTTP endpoints, because they have
different semantics (unlike completed tasks, unreachable tasks can
resume running). The size of this map is limited by a new master flag,
`--max_unreachable_tasks_per_framework`. This commit also changes the
master to omit re-adding non-partition-aware tasks on re-registering
agents (unless the master has failed over): those tasks will shortly be
shutdown anyway.

Finally, this commit fixes a minor bug in the previous code: the
previous coding neglected to shutdown non-partition-aware frameworks
running on pre-1.0 Mesos agents that re-register with the master after
a network partition.


Diffs (updated)
-----

  docs/configuration.md 2113d06f58dddc0a28ae1241a24096266fe39801 
  include/mesos/master/master.proto 03203c8ddd05876339b1c56789b7bb140499d49d 
  include/mesos/v1/master/master.proto f8edf39a68752c8601cece345f52bce8b9a8a68b 
  src/master/constants.hpp 900b69460de9280f30b16b86925e436054e080ca 
  src/master/flags.hpp 6a17b763dc76daa10073394f416b049e97a44238 
  src/master/flags.cpp 124bfbca6be9d5d60597b9a1793bd807202da8ec 
  src/master/http.cpp a44621f39cb059e654a56f57f75b38947f3a4230 
  src/master/master.hpp 8e8a9037af11cf95961b6498540a0fd486ed091b 
  src/master/master.cpp 73159328ce3fd838e02eba0e6a30cf69efc319ba 
  src/tests/partition_tests.cpp 72013d1bfee275c6f3cb90173f0c408d55e0bc5d 

Diff: https://reviews.apache.org/r/54183/diff/


Testing
-------

`make check`


Thanks,

Neil Conway

Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

Posted by Neil Conway <ne...@gmail.com>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/
-----------------------------------------------------------

(Updated Jan. 17, 2017, 6:36 p.m.)


Review request for mesos and Vinod Kone.


Changes
-------

Improve comments.


Bugs: MESOS-6619
    https://issues.apache.org/jira/browse/MESOS-6619


Repository: mesos


Description
-------

Before partition-awareness, when an agent failed health checks, the
master removed the agent from the registry, marked all of its tasks
TASK_LOST, and moved them to the `completedTasks` list in the master's
memory. Although "lost" tasks might still be running, partitioned agents
would only be allowed to re-register if the master failed over, in which
case the `completedTasks` map would be emptied.

When partition-awareness was introduced, we initially followed the same
scheme, with the only difference that partition-aware tasks are marked
TASK_UNREACHABLE, not TASK_LOST.

This scheme has a few shortcomings. First, partition-aware tasks might
resume running when the partitioned agent re-registers. Second, we
re-added non-partition aware tasks when the agent re-registered but then
marked them completed when the framework is shutdown, resulting in two
entries in `completedTasks`.

This commit introduces a separate bounded map, `unreachableTasks`. These
tasks are reported separately via the HTTP endpoints, because they have
different semantics (unlike completed tasks, unreachable tasks can
resume running). The size of this map is limited by a new master flag,
`--max_unreachable_tasks_per_framework`. This commit also changes the
master to omit re-adding non-partition-aware tasks on re-registering
agents (unless the master has failed over): those tasks will shortly be
shutdown anyway.

Finally, this commit fixes a minor bug in the previous code: the
previous coding neglected to shutdown non-partition-aware frameworks
running on pre-1.0 Mesos agents that re-register with the master after
a network partition.


Diffs (updated)
-----

  docs/configuration.md 2113d06f58dddc0a28ae1241a24096266fe39801 
  include/mesos/master/master.proto 03203c8ddd05876339b1c56789b7bb140499d49d 
  include/mesos/v1/master/master.proto f8edf39a68752c8601cece345f52bce8b9a8a68b 
  src/master/constants.hpp 900b69460de9280f30b16b86925e436054e080ca 
  src/master/flags.hpp 6a17b763dc76daa10073394f416b049e97a44238 
  src/master/flags.cpp 124bfbca6be9d5d60597b9a1793bd807202da8ec 
  src/master/http.cpp a44621f39cb059e654a56f57f75b38947f3a4230 
  src/master/master.hpp 44f4fecb1fbe8bebf830990a59a5462338e6e004 
  src/master/master.cpp b863ff6e93931c3d1ee056248084c7f44caf2fd9 
  src/tests/partition_tests.cpp 72013d1bfee275c6f3cb90173f0c408d55e0bc5d 

Diff: https://reviews.apache.org/r/54183/diff/


Testing
-------

`make check`


Thanks,

Neil Conway

Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

Posted by Neil Conway <ne...@gmail.com>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/
-----------------------------------------------------------

(Updated Jan. 17, 2017, 5:24 p.m.)


Review request for mesos and Vinod Kone.


Changes
-------

Rebase.


Bugs: MESOS-6619
    https://issues.apache.org/jira/browse/MESOS-6619


Repository: mesos


Description
-------

Before partition-awareness, when an agent failed health checks, the
master removed the agent from the registry, marked all of its tasks
TASK_LOST, and moved them to the `completedTasks` list in the master's
memory. Although "lost" tasks might still be running, partitioned agents
would only be allowed to re-register if the master failed over, in which
case the `completedTasks` map would be emptied.

When partition-awareness was introduced, we initially followed the same
scheme, with the only difference that partition-aware tasks are marked
TASK_UNREACHABLE, not TASK_LOST.

This scheme has a few shortcomings. First, partition-aware tasks might
resume running when the partitioned agent re-registers. Second, we
re-added non-partition aware tasks when the agent re-registered but then
marked them completed when the framework is shutdown, resulting in two
entries in `completedTasks`.

This commit introduces a separate bounded map, `unreachableTasks`. These
tasks are reported separately via the HTTP endpoints, because they have
different semantics (unlike completed tasks, unreachable tasks can
resume running). The size of this map is limited by a new master flag,
`--max_unreachable_tasks_per_framework`. This commit also changes the
master to omit re-adding non-partition-aware tasks on re-registering
agents (unless the master has failed over): those tasks will shortly be
shutdown anyway.

Finally, this commit fixes a minor bug in the previous code: the
previous coding neglected to shutdown non-partition-aware frameworks
running on pre-1.0 Mesos agents that re-register with the master after
a network partition.


Diffs (updated)
-----

  docs/configuration.md 2113d06f58dddc0a28ae1241a24096266fe39801 
  include/mesos/master/master.proto 03203c8ddd05876339b1c56789b7bb140499d49d 
  include/mesos/v1/master/master.proto f8edf39a68752c8601cece345f52bce8b9a8a68b 
  src/master/constants.hpp 900b69460de9280f30b16b86925e436054e080ca 
  src/master/flags.hpp 6a17b763dc76daa10073394f416b049e97a44238 
  src/master/flags.cpp 124bfbca6be9d5d60597b9a1793bd807202da8ec 
  src/master/http.cpp a44621f39cb059e654a56f57f75b38947f3a4230 
  src/master/master.hpp 44f4fecb1fbe8bebf830990a59a5462338e6e004 
  src/master/master.cpp b863ff6e93931c3d1ee056248084c7f44caf2fd9 
  src/tests/partition_tests.cpp 72013d1bfee275c6f3cb90173f0c408d55e0bc5d 

Diff: https://reviews.apache.org/r/54183/diff/


Testing
-------

`make check`


Thanks,

Neil Conway

Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

Posted by Neil Conway <ne...@gmail.com>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/
-----------------------------------------------------------

(Updated Jan. 9, 2017, 10:39 p.m.)


Review request for mesos and Vinod Kone.


Changes
-------

Address review comments, unbreak test cases.


Bugs: MESOS-6619
    https://issues.apache.org/jira/browse/MESOS-6619


Repository: mesos


Description
-------

Before partition-awareness, when an agent failed health checks, the
master removed the agent from the registry, marked all of its tasks
TASK_LOST, and moved them to the `completedTasks` list in the master's
memory. Although "lost" tasks might still be running, partitioned agents
would only be allowed to re-register if the master failed over, in which
case the `completedTasks` map would be emptied.

When partition-awareness was introduced, we initially followed the same
scheme, with the only difference that partition-aware tasks are marked
TASK_UNREACHABLE, not TASK_LOST.

This scheme has a few shortcomings. First, partition-aware tasks might
resume running when the partitioned agent re-registers. Second, we
re-added non-partition aware tasks when the agent re-registered but then
marked them completed when the framework is shutdown, resulting in two
entries in `completedTasks`.

This commit introduces a separate bounded map, `unreachableTasks`. These
tasks are reported separately via the HTTP endpoints, because they have
different semantics (unlike completed tasks, unreachable tasks can
resume running). The size of this map is limited by a new master flag,
`--max_unreachable_tasks_per_framework`. This commit also changes the
master to omit re-adding non-partition-aware tasks on re-registering
agents (unless the master has failed over): those tasks will shortly be
shutdown anyway.

Finally, this commit fixes a minor bug in the previous code: the
previous coding neglected to shutdown non-partition-aware frameworks
running on pre-1.0 Mesos agents that re-register with the master after
a network partition.


Diffs (updated)
-----

  docs/configuration.md 2113d06f58dddc0a28ae1241a24096266fe39801 
  include/mesos/master/master.proto 03203c8ddd05876339b1c56789b7bb140499d49d 
  include/mesos/v1/master/master.proto f8edf39a68752c8601cece345f52bce8b9a8a68b 
  src/master/constants.hpp 900b69460de9280f30b16b86925e436054e080ca 
  src/master/flags.hpp 6a17b763dc76daa10073394f416b049e97a44238 
  src/master/flags.cpp 124bfbca6be9d5d60597b9a1793bd807202da8ec 
  src/master/http.cpp 75dcd6199dbfcca6260baed49b838a45312bb667 
  src/master/master.hpp 368ee1d5e97784fa54e0f141906405ee8f104317 
  src/master/master.cpp 0c2210bbcd0622f704497daf78fe959cde99ff7f 
  src/tests/partition_tests.cpp 72013d1bfee275c6f3cb90173f0c408d55e0bc5d 

Diff: https://reviews.apache.org/r/54183/diff/


Testing
-------

`make check`


Thanks,

Neil Conway

Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

Posted by Vinod Kone <vi...@gmail.com>.


> On Jan. 3, 2017, 10:25 p.m., Vinod Kone wrote:
> > src/master/master.cpp, line 5507
> > <https://reviews.apache.org/r/54183/diff/6/?file=1588710#file1588710line5507>
> >
> >     can we inline this?
> 
> Neil Conway wrote:
>     We could, but personally I find it more readable to make it a separate function. The logic is a bit involved here (since we need to check two different data structures due to backward compatibility requirements), and there's already a lot going on in `Master::_reregisterSlave`.
> 
> Vinod Kone wrote:
>     In general, we tend to factor out something into a function if it is reusable and generic, not to reduce the number of lines someone has to read in a function. In fact the argument was to inline in such cases because it will avoid the need to context switch. 
>     
>     Both `findPartitonAwareFrameworks` and `filterPartitionTasks` seem very specific to the `_reregisterSlave` method, hence the suggestion.
>     
>     For example, instead of looping through all the agent's tasks 3 times to figure out which tasks to add to `Slave*`, it's probably easier to do it in one inline loop here?
> 
> Neil Conway wrote:
>     My goal isn't to reduce the # of lines to read the function, but to raise the level of abstraction a little bit. For example, `findPartitonAwareFrameworks` has to check _two_ data structures, depending on both agent version and whether the master has failed over. That is quite strange and takes a bit of thought to validate. With a separate function, the reader of `_reregisterSlave` doesn't need to think about those details if they don't want to -- instead they can just think "okay, we find all the partition-aware frameworks on the agent and then ..."
>     
>     I'm fine with looping over the agent's tasks three times, because each of the three things we want to do is logically independent -- it takes a bit of thought to validate whether it is actually correct to combine them into a single loop (because each loop iteration would access a collection that is modified by the loop itself. It would actually be okay, but IMO current approach is safer).

This is the code I had in mind. Let me know what you think:

```
// Add recovered frameworks into a hashset for easy lookup.
hashset<FrameworkID, FrameworkInfo> frameworks_;
foreach (const FrameworkInfo& framework, frameworks) {
  frameworks_.put(framework.id(), framework);
}

vector<Task> tasks_;
foreach (const Task& task, tasks) {
  const FrameworkInfo& frameworkInfo = frameworks_[task.framework_id()];
  Framework* framework = getFramework(framework.id());
  
  if (protobuf::frameworkHasCapability(frameworkInfo, PARTITION_AWARE)) {
    // Add the task if it is partition aware.
    tasks_.push_back(task);
    
    // Remove the partition aware task from `unreachableTasks`.
    if (framework != null) {
      framework->unreachableTasks.erase(task.task_id());
    }
  } else if (slaves.removed.get(slaveInfo.id()).isSome()) {
    // Add a non-partion task iff the master failed over;
    tasks_.push_back(task);
  }
}
  
Slave* slave = new Slave(
  this,
  slaveInfo,
  pid,
  machineId,
  version,
  Clock::now(),
  checkpointedResources,
  executorInfos,
  tasks_);
}

```


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/#review160019
-----------------------------------------------------------


On Jan. 17, 2017, 6:36 p.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54183/
> -----------------------------------------------------------
> 
> (Updated Jan. 17, 2017, 6:36 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6619
>     https://issues.apache.org/jira/browse/MESOS-6619
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Before partition-awareness, when an agent failed health checks, the
> master removed the agent from the registry, marked all of its tasks
> TASK_LOST, and moved them to the `completedTasks` list in the master's
> memory. Although "lost" tasks might still be running, partitioned agents
> would only be allowed to re-register if the master failed over, in which
> case the `completedTasks` map would be emptied.
> 
> When partition-awareness was introduced, we initially followed the same
> scheme, with the only difference that partition-aware tasks are marked
> TASK_UNREACHABLE, not TASK_LOST.
> 
> This scheme has a few shortcomings. First, partition-aware tasks might
> resume running when the partitioned agent re-registers. Second, we
> re-added non-partition aware tasks when the agent re-registered but then
> marked them completed when the framework is shutdown, resulting in two
> entries in `completedTasks`.
> 
> This commit introduces a separate bounded map, `unreachableTasks`. These
> tasks are reported separately via the HTTP endpoints, because they have
> different semantics (unlike completed tasks, unreachable tasks can
> resume running). The size of this map is limited by a new master flag,
> `--max_unreachable_tasks_per_framework`. This commit also changes the
> master to omit re-adding non-partition-aware tasks on re-registering
> agents (unless the master has failed over): those tasks will shortly be
> shutdown anyway.
> 
> Finally, this commit fixes a minor bug in the previous code: the
> previous coding neglected to shutdown non-partition-aware frameworks
> running on pre-1.0 Mesos agents that re-register with the master after
> a network partition.
> 
> 
> Diffs
> -----
> 
>   docs/configuration.md 2113d06f58dddc0a28ae1241a24096266fe39801 
>   include/mesos/master/master.proto 03203c8ddd05876339b1c56789b7bb140499d49d 
>   include/mesos/v1/master/master.proto f8edf39a68752c8601cece345f52bce8b9a8a68b 
>   src/master/constants.hpp 900b69460de9280f30b16b86925e436054e080ca 
>   src/master/flags.hpp 6a17b763dc76daa10073394f416b049e97a44238 
>   src/master/flags.cpp 124bfbca6be9d5d60597b9a1793bd807202da8ec 
>   src/master/http.cpp a44621f39cb059e654a56f57f75b38947f3a4230 
>   src/master/master.hpp 44f4fecb1fbe8bebf830990a59a5462338e6e004 
>   src/master/master.cpp b863ff6e93931c3d1ee056248084c7f44caf2fd9 
>   src/tests/partition_tests.cpp 72013d1bfee275c6f3cb90173f0c408d55e0bc5d 
> 
> Diff: https://reviews.apache.org/r/54183/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Neil Conway
> 
>

Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

Posted by Neil Conway <ne...@gmail.com>.


> On Jan. 3, 2017, 10:25 p.m., Vinod Kone wrote:
> > src/master/master.hpp, lines 2634-2641
> > <https://reviews.apache.org/r/54183/diff/6/?file=1588709#file1588709line2634>
> >
> >     this reads like `unreachableTasks` are completed tasks of PA frameworks. can you split this comment up and move the unreachable specific comments to #2644?

I added an additional comment describing the purpose of `unreachableTasks`, but I felt like it was useful to still keep a note in the `completedTasks` comment to describe which tasks are stored in `unreachableTasks` vs `completedTasks` when an agent is marked unreachable. Let me know if you think this is still confusing.


> On Jan. 3, 2017, 10:25 p.m., Vinod Kone wrote:
> > src/master/master.cpp, line 5496
> > <https://reviews.apache.org/r/54183/diff/6/?file=1588710#file1588710line5496>
> >
> >     it's not necessary that the agent has failed over if we are here right?

Do you mean "it's not necessary that the master has failed over if we are here"?

If so, that's right: the master may or may not have re-registered in the time since the agent was marked unreachable. The comment talks about how these two cases are handled differently.


> On Jan. 3, 2017, 10:25 p.m., Vinod Kone wrote:
> > src/master/master.cpp, line 5507
> > <https://reviews.apache.org/r/54183/diff/6/?file=1588710#file1588710line5507>
> >
> >     can we inline this?

We could, but personally I find it more readable to make it a separate function. The logic is a bit involved here (since we need to check two different data structures due to backward compatibility requirements), and there's already a lot going on in `Master::_reregisterSlave`.


> On Jan. 3, 2017, 10:25 p.m., Vinod Kone wrote:
> > src/master/master.cpp, line 5539
> > <https://reviews.apache.org/r/54183/diff/6/?file=1588710#file1588710line5539>
> >
> >     what if the agent was marked unreachable by the previous master? do we still want to add all its tasks?

If the master has failed over, all tasks on the agent will be allowed to continue running (whether partition-aware or not) -- so that's why we re-add all tasks here.


> On Jan. 3, 2017, 10:25 p.m., Vinod Kone wrote:
> > src/tests/partition_tests.cpp, lines 1819-1847
> > <https://reviews.apache.org/r/54183/diff/6/?file=1588711#file1588711line1819>
> >
> >     hmm. it is unfortunate that we send TASK_LOST followed by TASK_FINISHED and then send TASK_LOST on reconciliation. can you remind me why the master forwards status updates for unknown tasks? looks like it can just drop them if the reason for doing so is no longer valid.

I don't know why we forward status updates for unknown tasks. I can see it having some value, though -- in the scenario above, the first `TASK_LOST` just means the agent become unreachable. `TASK_FINISHED` tells the framework that the task actually did complete successfully. I can imagine situations where, say, the framework ignores `TASK_LOST` or uses a large timeout before assuming the task is "really lost", whereas `TASK_FINISHED` / `TASK_KILLED` / etc. is a more definitive signal.


> On Jan. 3, 2017, 10:25 p.m., Vinod Kone wrote:
> > src/master/master.cpp, line 5523
> > <https://reviews.apache.org/r/54183/diff/6/?file=1588710#file1588710line5523>
> >
> >     not sure if this is worth making into a function?

I thought it was a bit more readable (against, `Master::_reregisterSlave` has a lot of stuff going on), but happy to change it if you disagree.


- Neil


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/#review160019
-----------------------------------------------------------


On Dec. 18, 2016, 11:29 p.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54183/
> -----------------------------------------------------------
> 
> (Updated Dec. 18, 2016, 11:29 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6619
>     https://issues.apache.org/jira/browse/MESOS-6619
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Before partition-awareness, when an agent failed health checks, the
> master removed the agent from the registry, marked all of its tasks
> TASK_LOST, and moved them to the `completedTasks` list in the master's
> memory. Although "lost" tasks might still be running, partitioned agents
> would only be allowed to re-register if the master failed over, in which
> case the `completedTasks` map would be emptied.
> 
> When partition-awareness was introduced, we initially followed the same
> scheme, with the only difference that partition-aware tasks are marked
> TASK_UNREACHABLE, not TASK_LOST.
> 
> This scheme has a few shortcomings. First, partition-aware tasks might
> resume running when the partitioned agent re-registers. Second, we
> re-added non-partition aware tasks when the agent re-registered but then
> marked them completed when the framework is shutdown, resulting in two
> entries in `completedTasks`.
> 
> This commit introduces a separate bounded map, `unreachableTasks`. These
> tasks are reported separately via the HTTP endpoints, because they have
> different semantics (unlike completed tasks, unreachable tasks can
> resume running). The size of this map is limited by a new master flag,
> `--max_unreachable_tasks_per_framework`. This commit also changes the
> master to omit re-adding non-partition-aware tasks on re-registering
> agents (unless the master has failed over): those tasks will shortly be
> shutdown anyway.
> 
> Finally, this commit fixes a minor bug in the previous code: the
> previous coding neglected to shutdown non-partition-aware frameworks
> running on pre-1.0 Mesos agents that re-register with the master after
> a network partition.
> 
> 
> Diffs
> -----
> 
>   docs/configuration.md efe3e9bd9d203a7ba44adf4ead24f14b8b577637 
>   include/mesos/master/master.proto 483dc2f59684481c9f549de9f06eef4dda8a46e1 
>   include/mesos/v1/master/master.proto 09a82af88303a2d971da7c56a7075d7005932363 
>   src/master/constants.hpp 5dd0667f62d2c0617cc0d5aed8cc005bd8344c88 
>   src/master/flags.hpp 6a17b763dc76daa10073394f416b049e97a44238 
>   src/master/flags.cpp e5edf3333d0a0c529a10dc602ef9a88a0ec60c69 
>   src/master/http.cpp ee559804c490d53be2d12f233ae0bfb93ed9f96d 
>   src/master/master.hpp 89b3c394b268a8645885412aeb19980db8d73c69 
>   src/master/master.cpp b664550d57ef9805bd23ea35ca7f9cd8f4b1ab78 
>   src/tests/partition_tests.cpp e1e2025bd0f078836323cbd8c6d7836815c4c38d 
> 
> Diff: https://reviews.apache.org/r/54183/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Neil Conway
> 
>

Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

Posted by Neil Conway <ne...@gmail.com>.


> On Jan. 3, 2017, 10:25 p.m., Vinod Kone wrote:
> > src/master/master.cpp, line 5507
> > <https://reviews.apache.org/r/54183/diff/6/?file=1588710#file1588710line5507>
> >
> >     can we inline this?
> 
> Neil Conway wrote:
>     We could, but personally I find it more readable to make it a separate function. The logic is a bit involved here (since we need to check two different data structures due to backward compatibility requirements), and there's already a lot going on in `Master::_reregisterSlave`.
> 
> Vinod Kone wrote:
>     In general, we tend to factor out something into a function if it is reusable and generic, not to reduce the number of lines someone has to read in a function. In fact the argument was to inline in such cases because it will avoid the need to context switch. 
>     
>     Both `findPartitonAwareFrameworks` and `filterPartitionTasks` seem very specific to the `_reregisterSlave` method, hence the suggestion.
>     
>     For example, instead of looping through all the agent's tasks 3 times to figure out which tasks to add to `Slave*`, it's probably easier to do it in one inline loop here?
> 
> Neil Conway wrote:
>     My goal isn't to reduce the # of lines to read the function, but to raise the level of abstraction a little bit. For example, `findPartitonAwareFrameworks` has to check _two_ data structures, depending on both agent version and whether the master has failed over. That is quite strange and takes a bit of thought to validate. With a separate function, the reader of `_reregisterSlave` doesn't need to think about those details if they don't want to -- instead they can just think "okay, we find all the partition-aware frameworks on the agent and then ..."
>     
>     I'm fine with looping over the agent's tasks three times, because each of the three things we want to do is logically independent -- it takes a bit of thought to validate whether it is actually correct to combine them into a single loop (because each loop iteration would access a collection that is modified by the loop itself. It would actually be okay, but IMO current approach is safer).
> 
> Vinod Kone wrote:
>     This is the code I had in mind. Let me know what you think:
>     
>     ```
>     // Add recovered frameworks into a hashset for easy lookup.
>     hashset<FrameworkID, FrameworkInfo> frameworks_;
>     foreach (const FrameworkInfo& framework, frameworks) {
>       frameworks_.put(framework.id(), framework);
>     }
>     
>     vector<Task> tasks_;
>     foreach (const Task& task, tasks) {
>       const FrameworkInfo& frameworkInfo = frameworks_[task.framework_id()];
>       Framework* framework = getFramework(framework.id());
>       
>       if (protobuf::frameworkHasCapability(frameworkInfo, PARTITION_AWARE)) {
>         // Add the task if it is partition aware.
>         tasks_.push_back(task);
>         
>         // Remove the partition aware task from `unreachableTasks`.
>         if (framework != null) {
>           framework->unreachableTasks.erase(task.task_id());
>         }
>       } else if (slaves.removed.get(slaveInfo.id()).isSome()) {
>         // Add a non-partion task iff the master failed over;
>         tasks_.push_back(task);
>       }
>     }
>       
>     Slave* slave = new Slave(
>       this,
>       slaveInfo,
>       pid,
>       machineId,
>       version,
>       Clock::now(),
>       checkpointedResources,
>       executorInfos,
>       tasks_);
>     }
>     
>     ```

Revised the RR to use something similar to the above.


- Neil


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/#review160019
-----------------------------------------------------------


On Jan. 18, 2017, 7:18 p.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54183/
> -----------------------------------------------------------
> 
> (Updated Jan. 18, 2017, 7:18 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6619
>     https://issues.apache.org/jira/browse/MESOS-6619
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Before partition-awareness, when an agent failed health checks, the
> master removed the agent from the registry, marked all of its tasks
> TASK_LOST, and moved them to the `completedTasks` list in the master's
> memory. Although "lost" tasks might still be running, partitioned agents
> would only be allowed to re-register if the master failed over, in which
> case the `completedTasks` map would be emptied.
> 
> When partition-awareness was introduced, we initially followed the same
> scheme, with the only difference that partition-aware tasks are marked
> TASK_UNREACHABLE, not TASK_LOST.
> 
> This scheme has a few shortcomings. First, partition-aware tasks might
> resume running when the partitioned agent re-registers. Second, we
> re-added non-partition aware tasks when the agent re-registered but then
> marked them completed when the framework is shutdown, resulting in two
> entries in `completedTasks`.
> 
> This commit introduces a separate bounded map, `unreachableTasks`. These
> tasks are reported separately via the HTTP endpoints, because they have
> different semantics (unlike completed tasks, unreachable tasks can
> resume running). The size of this map is limited by a new master flag,
> `--max_unreachable_tasks_per_framework`. This commit also changes the
> master to omit re-adding non-partition-aware tasks on re-registering
> agents (unless the master has failed over): those tasks will shortly be
> shutdown anyway.
> 
> Finally, this commit fixes a minor bug in the previous code: the
> previous coding neglected to shutdown non-partition-aware frameworks
> running on pre-1.0 Mesos agents that re-register with the master after
> a network partition.
> 
> 
> Diffs
> -----
> 
>   docs/configuration.md 2113d06f58dddc0a28ae1241a24096266fe39801 
>   include/mesos/master/master.proto 03203c8ddd05876339b1c56789b7bb140499d49d 
>   include/mesos/v1/master/master.proto f8edf39a68752c8601cece345f52bce8b9a8a68b 
>   src/master/constants.hpp 900b69460de9280f30b16b86925e436054e080ca 
>   src/master/flags.hpp 6a17b763dc76daa10073394f416b049e97a44238 
>   src/master/flags.cpp 124bfbca6be9d5d60597b9a1793bd807202da8ec 
>   src/master/http.cpp a44621f39cb059e654a56f57f75b38947f3a4230 
>   src/master/master.hpp 8e8a9037af11cf95961b6498540a0fd486ed091b 
>   src/master/master.cpp 73159328ce3fd838e02eba0e6a30cf69efc319ba 
>   src/tests/partition_tests.cpp 72013d1bfee275c6f3cb90173f0c408d55e0bc5d 
> 
> Diff: https://reviews.apache.org/r/54183/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Neil Conway
> 
>

Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

Posted by Neil Conway <ne...@gmail.com>.


> On Jan. 3, 2017, 10:25 p.m., Vinod Kone wrote:
> > src/master/master.cpp, line 5507
> > <https://reviews.apache.org/r/54183/diff/6/?file=1588710#file1588710line5507>
> >
> >     can we inline this?
> 
> Neil Conway wrote:
>     We could, but personally I find it more readable to make it a separate function. The logic is a bit involved here (since we need to check two different data structures due to backward compatibility requirements), and there's already a lot going on in `Master::_reregisterSlave`.
> 
> Vinod Kone wrote:
>     In general, we tend to factor out something into a function if it is reusable and generic, not to reduce the number of lines someone has to read in a function. In fact the argument was to inline in such cases because it will avoid the need to context switch. 
>     
>     Both `findPartitonAwareFrameworks` and `filterPartitionTasks` seem very specific to the `_reregisterSlave` method, hence the suggestion.
>     
>     For example, instead of looping through all the agent's tasks 3 times to figure out which tasks to add to `Slave*`, it's probably easier to do it in one inline loop here?

My goal isn't to reduce the # of lines to read the function, but to raise the level of abstraction a little bit. For example, `findPartitonAwareFrameworks` has to check _two_ data structures, depending on both agent version and whether the master has failed over. That is quite strange and takes a bit of thought to validate. With a separate function, the reader of `_reregisterSlave` doesn't need to think about those details if they don't want to -- instead they can just think "okay, we find all the partition-aware frameworks on the agent and then ..."

I'm fine with looping over the agent's tasks three times, because each of the three things we want to do is logically independent -- it takes a bit of thought to validate whether it is actually correct to combine them into a single loop (because each loop iteration would access a collection that is modified by the loop itself. It would actually be okay, but IMO current approach is safer).


- Neil


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/#review160019
-----------------------------------------------------------


On Jan. 9, 2017, 10:39 p.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54183/
> -----------------------------------------------------------
> 
> (Updated Jan. 9, 2017, 10:39 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6619
>     https://issues.apache.org/jira/browse/MESOS-6619
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Before partition-awareness, when an agent failed health checks, the
> master removed the agent from the registry, marked all of its tasks
> TASK_LOST, and moved them to the `completedTasks` list in the master's
> memory. Although "lost" tasks might still be running, partitioned agents
> would only be allowed to re-register if the master failed over, in which
> case the `completedTasks` map would be emptied.
> 
> When partition-awareness was introduced, we initially followed the same
> scheme, with the only difference that partition-aware tasks are marked
> TASK_UNREACHABLE, not TASK_LOST.
> 
> This scheme has a few shortcomings. First, partition-aware tasks might
> resume running when the partitioned agent re-registers. Second, we
> re-added non-partition aware tasks when the agent re-registered but then
> marked them completed when the framework is shutdown, resulting in two
> entries in `completedTasks`.
> 
> This commit introduces a separate bounded map, `unreachableTasks`. These
> tasks are reported separately via the HTTP endpoints, because they have
> different semantics (unlike completed tasks, unreachable tasks can
> resume running). The size of this map is limited by a new master flag,
> `--max_unreachable_tasks_per_framework`. This commit also changes the
> master to omit re-adding non-partition-aware tasks on re-registering
> agents (unless the master has failed over): those tasks will shortly be
> shutdown anyway.
> 
> Finally, this commit fixes a minor bug in the previous code: the
> previous coding neglected to shutdown non-partition-aware frameworks
> running on pre-1.0 Mesos agents that re-register with the master after
> a network partition.
> 
> 
> Diffs
> -----
> 
>   docs/configuration.md 2113d06f58dddc0a28ae1241a24096266fe39801 
>   include/mesos/master/master.proto 03203c8ddd05876339b1c56789b7bb140499d49d 
>   include/mesos/v1/master/master.proto f8edf39a68752c8601cece345f52bce8b9a8a68b 
>   src/master/constants.hpp 900b69460de9280f30b16b86925e436054e080ca 
>   src/master/flags.hpp 6a17b763dc76daa10073394f416b049e97a44238 
>   src/master/flags.cpp 124bfbca6be9d5d60597b9a1793bd807202da8ec 
>   src/master/http.cpp 75dcd6199dbfcca6260baed49b838a45312bb667 
>   src/master/master.hpp 368ee1d5e97784fa54e0f141906405ee8f104317 
>   src/master/master.cpp 0c2210bbcd0622f704497daf78fe959cde99ff7f 
>   src/tests/partition_tests.cpp 72013d1bfee275c6f3cb90173f0c408d55e0bc5d 
> 
> Diff: https://reviews.apache.org/r/54183/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Neil Conway
> 
>

Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

Posted by Vinod Kone <vi...@gmail.com>.


> On Jan. 3, 2017, 10:25 p.m., Vinod Kone wrote:
> > src/master/master.cpp, line 5496
> > <https://reviews.apache.org/r/54183/diff/6/?file=1588710#file1588710line5496>
> >
> >     it's not necessary that the agent has failed over if we are here right?
> 
> Neil Conway wrote:
>     Do you mean "it's not necessary that the master has failed over if we are here"?
>     
>     If so, that's right: the master may or may not have re-registered in the time since the agent was marked unreachable. The comment talks about how these two cases are handled differently.

I forgot what I was asking here originally :( Lets drop it for now.


> On Jan. 3, 2017, 10:25 p.m., Vinod Kone wrote:
> > src/master/master.hpp, lines 2634-2641
> > <https://reviews.apache.org/r/54183/diff/6/?file=1588709#file1588709line2634>
> >
> >     this reads like `unreachableTasks` are completed tasks of PA frameworks. can you split this comment up and move the unreachable specific comments to #2644?
> 
> Neil Conway wrote:
>     I added an additional comment describing the purpose of `unreachableTasks`, but I felt like it was useful to still keep a note in the `completedTasks` comment to describe which tasks are stored in `unreachableTasks` vs `completedTasks` when an agent is marked unreachable. Let me know if you think this is still confusing.

lgtm.


> On Jan. 3, 2017, 10:25 p.m., Vinod Kone wrote:
> > src/master/master.cpp, line 5507
> > <https://reviews.apache.org/r/54183/diff/6/?file=1588710#file1588710line5507>
> >
> >     can we inline this?
> 
> Neil Conway wrote:
>     We could, but personally I find it more readable to make it a separate function. The logic is a bit involved here (since we need to check two different data structures due to backward compatibility requirements), and there's already a lot going on in `Master::_reregisterSlave`.

In general, we tend to factor out something into a function if it is reusable and generic, not to reduce the number of lines someone has to read in a function. In fact the argument was to inline in such cases because it will avoid the need to context switch. 

Both `findPartitonAwareFrameworks` and `filterPartitionTasks` seem very specific to the `_reregisterSlave` method, hence the suggestion.

For example, instead of looping through all the agent's tasks 3 times to figure out which tasks to add to `Slave*`, it's probably easier to do it in one inline loop here?


> On Jan. 3, 2017, 10:25 p.m., Vinod Kone wrote:
> > src/master/master.cpp, line 5523
> > <https://reviews.apache.org/r/54183/diff/6/?file=1588710#file1588710line5523>
> >
> >     not sure if this is worth making into a function?
> 
> Neil Conway wrote:
>     I thought it was a bit more readable (against, `Master::_reregisterSlave` has a lot of stuff going on), but happy to change it if you disagree.

see above.


> On Jan. 3, 2017, 10:25 p.m., Vinod Kone wrote:
> > src/master/master.cpp, line 5539
> > <https://reviews.apache.org/r/54183/diff/6/?file=1588710#file1588710line5539>
> >
> >     what if the agent was marked unreachable by the previous master? do we still want to add all its tasks?
> 
> Neil Conway wrote:
>     If the master has failed over, all tasks on the agent will be allowed to continue running (whether partition-aware or not) -- so that's why we re-add all tasks here.

I see.


> On Jan. 3, 2017, 10:25 p.m., Vinod Kone wrote:
> > src/tests/partition_tests.cpp, lines 1819-1847
> > <https://reviews.apache.org/r/54183/diff/6/?file=1588711#file1588711line1819>
> >
> >     hmm. it is unfortunate that we send TASK_LOST followed by TASK_FINISHED and then send TASK_LOST on reconciliation. can you remind me why the master forwards status updates for unknown tasks? looks like it can just drop them if the reason for doing so is no longer valid.
> 
> Neil Conway wrote:
>     I don't know why we forward status updates for unknown tasks. I can see it having some value, though -- in the scenario above, the first `TASK_LOST` just means the agent become unreachable. `TASK_FINISHED` tells the framework that the task actually did complete successfully. I can imagine situations where, say, the framework ignores `TASK_LOST` or uses a large timeout before assuming the task is "really lost", whereas `TASK_FINISHED` / `TASK_KILLED` / etc. is a more definitive signal.

I see. Atleat the PA world is much saner. TASK_UNREACHABLE -> TASK_FINISHED -> TASK_UNKNOWN


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/#review160019
-----------------------------------------------------------


On Jan. 9, 2017, 10:39 p.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54183/
> -----------------------------------------------------------
> 
> (Updated Jan. 9, 2017, 10:39 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6619
>     https://issues.apache.org/jira/browse/MESOS-6619
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Before partition-awareness, when an agent failed health checks, the
> master removed the agent from the registry, marked all of its tasks
> TASK_LOST, and moved them to the `completedTasks` list in the master's
> memory. Although "lost" tasks might still be running, partitioned agents
> would only be allowed to re-register if the master failed over, in which
> case the `completedTasks` map would be emptied.
> 
> When partition-awareness was introduced, we initially followed the same
> scheme, with the only difference that partition-aware tasks are marked
> TASK_UNREACHABLE, not TASK_LOST.
> 
> This scheme has a few shortcomings. First, partition-aware tasks might
> resume running when the partitioned agent re-registers. Second, we
> re-added non-partition aware tasks when the agent re-registered but then
> marked them completed when the framework is shutdown, resulting in two
> entries in `completedTasks`.
> 
> This commit introduces a separate bounded map, `unreachableTasks`. These
> tasks are reported separately via the HTTP endpoints, because they have
> different semantics (unlike completed tasks, unreachable tasks can
> resume running). The size of this map is limited by a new master flag,
> `--max_unreachable_tasks_per_framework`. This commit also changes the
> master to omit re-adding non-partition-aware tasks on re-registering
> agents (unless the master has failed over): those tasks will shortly be
> shutdown anyway.
> 
> Finally, this commit fixes a minor bug in the previous code: the
> previous coding neglected to shutdown non-partition-aware frameworks
> running on pre-1.0 Mesos agents that re-register with the master after
> a network partition.
> 
> 
> Diffs
> -----
> 
>   docs/configuration.md 2113d06f58dddc0a28ae1241a24096266fe39801 
>   include/mesos/master/master.proto 03203c8ddd05876339b1c56789b7bb140499d49d 
>   include/mesos/v1/master/master.proto f8edf39a68752c8601cece345f52bce8b9a8a68b 
>   src/master/constants.hpp 900b69460de9280f30b16b86925e436054e080ca 
>   src/master/flags.hpp 6a17b763dc76daa10073394f416b049e97a44238 
>   src/master/flags.cpp 124bfbca6be9d5d60597b9a1793bd807202da8ec 
>   src/master/http.cpp 75dcd6199dbfcca6260baed49b838a45312bb667 
>   src/master/master.hpp 368ee1d5e97784fa54e0f141906405ee8f104317 
>   src/master/master.cpp 0c2210bbcd0622f704497daf78fe959cde99ff7f 
>   src/tests/partition_tests.cpp 72013d1bfee275c6f3cb90173f0c408d55e0bc5d 
> 
> Diff: https://reviews.apache.org/r/54183/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Neil Conway
> 
>

Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

Posted by Vinod Kone <vi...@gmail.com>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/#review160019
-----------------------------------------------------------




src/master/master.hpp (lines 2634 - 2641)
<https://reviews.apache.org/r/54183/#comment231030>

    this reads like `unreachableTasks` are completed tasks of PA frameworks. can you split this comment up and move the unreachable specific comments to #2644?



src/master/master.cpp (line 5496)
<https://reviews.apache.org/r/54183/#comment231032>

    it's not necessary that the agent has failed over if we are here right?



src/master/master.cpp (line 5507)
<https://reviews.apache.org/r/54183/#comment231039>

    can we inline this?



src/master/master.cpp (lines 5513 - 5514)
<https://reviews.apache.org/r/54183/#comment231036>

    can you make this a TODO?



src/master/master.cpp (line 5523)
<https://reviews.apache.org/r/54183/#comment231038>

    not sure if this is worth making into a function?



src/master/master.cpp (line 5539)
<https://reviews.apache.org/r/54183/#comment231037>

    what if the agent was marked unreachable by the previous master? do we still want to add all its tasks?



src/tests/partition_tests.cpp (lines 1819 - 1847)
<https://reviews.apache.org/r/54183/#comment231581>

    hmm. it is unfortunate that we send TASK_LOST followed by TASK_FINISHED and then send TASK_LOST on reconciliation. can you remind me why the master forwards status updates for unknown tasks? looks like it can just drop them if the reason for doing so is no longer valid.


- Vinod Kone


On Dec. 18, 2016, 11:29 p.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54183/
> -----------------------------------------------------------
> 
> (Updated Dec. 18, 2016, 11:29 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6619
>     https://issues.apache.org/jira/browse/MESOS-6619
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Before partition-awareness, when an agent failed health checks, the
> master removed the agent from the registry, marked all of its tasks
> TASK_LOST, and moved them to the `completedTasks` list in the master's
> memory. Although "lost" tasks might still be running, partitioned agents
> would only be allowed to re-register if the master failed over, in which
> case the `completedTasks` map would be emptied.
> 
> When partition-awareness was introduced, we initially followed the same
> scheme, with the only difference that partition-aware tasks are marked
> TASK_UNREACHABLE, not TASK_LOST.
> 
> This scheme has a few shortcomings. First, partition-aware tasks might
> resume running when the partitioned agent re-registers. Second, we
> re-added non-partition aware tasks when the agent re-registered but then
> marked them completed when the framework is shutdown, resulting in two
> entries in `completedTasks`.
> 
> This commit introduces a separate bounded map, `unreachableTasks`. These
> tasks are reported separately via the HTTP endpoints, because they have
> different semantics (unlike completed tasks, unreachable tasks can
> resume running). The size of this map is limited by a new master flag,
> `--max_unreachable_tasks_per_framework`. This commit also changes the
> master to omit re-adding non-partition-aware tasks on re-registering
> agents (unless the master has failed over): those tasks will shortly be
> shutdown anyway.
> 
> Finally, this commit fixes a minor bug in the previous code: the
> previous coding neglected to shutdown non-partition-aware frameworks
> running on pre-1.0 Mesos agents that re-register with the master after
> a network partition.
> 
> 
> Diffs
> -----
> 
>   docs/configuration.md efe3e9bd9d203a7ba44adf4ead24f14b8b577637 
>   include/mesos/master/master.proto 483dc2f59684481c9f549de9f06eef4dda8a46e1 
>   include/mesos/v1/master/master.proto 09a82af88303a2d971da7c56a7075d7005932363 
>   src/master/constants.hpp 5dd0667f62d2c0617cc0d5aed8cc005bd8344c88 
>   src/master/flags.hpp 6a17b763dc76daa10073394f416b049e97a44238 
>   src/master/flags.cpp e5edf3333d0a0c529a10dc602ef9a88a0ec60c69 
>   src/master/http.cpp ee559804c490d53be2d12f233ae0bfb93ed9f96d 
>   src/master/master.hpp 89b3c394b268a8645885412aeb19980db8d73c69 
>   src/master/master.cpp b664550d57ef9805bd23ea35ca7f9cd8f4b1ab78 
>   src/tests/partition_tests.cpp e1e2025bd0f078836323cbd8c6d7836815c4c38d 
> 
> Diff: https://reviews.apache.org/r/54183/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Neil Conway
> 
>

Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

Posted by Neil Conway <ne...@gmail.com>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/
-----------------------------------------------------------

(Updated Dec. 18, 2016, 11:29 p.m.)


Review request for mesos and Vinod Kone.


Changes
-------

Rebase.


Bugs: MESOS-6619
    https://issues.apache.org/jira/browse/MESOS-6619


Repository: mesos


Description
-------

Before partition-awareness, when an agent failed health checks, the
master removed the agent from the registry, marked all of its tasks
TASK_LOST, and moved them to the `completedTasks` list in the master's
memory. Although "lost" tasks might still be running, partitioned agents
would only be allowed to re-register if the master failed over, in which
case the `completedTasks` map would be emptied.

When partition-awareness was introduced, we initially followed the same
scheme, with the only difference that partition-aware tasks are marked
TASK_UNREACHABLE, not TASK_LOST.

This scheme has a few shortcomings. First, partition-aware tasks might
resume running when the partitioned agent re-registers. Second, we
re-added non-partition aware tasks when the agent re-registered but then
marked them completed when the framework is shutdown, resulting in two
entries in `completedTasks`.

This commit introduces a separate bounded map, `unreachableTasks`. These
tasks are reported separately via the HTTP endpoints, because they have
different semantics (unlike completed tasks, unreachable tasks can
resume running). The size of this map is limited by a new master flag,
`--max_unreachable_tasks_per_framework`. This commit also changes the
master to omit re-adding non-partition-aware tasks on re-registering
agents (unless the master has failed over): those tasks will shortly be
shutdown anyway.

Finally, this commit fixes a minor bug in the previous code: the
previous coding neglected to shutdown non-partition-aware frameworks
running on pre-1.0 Mesos agents that re-register with the master after
a network partition.


Diffs (updated)
-----

  docs/configuration.md efe3e9bd9d203a7ba44adf4ead24f14b8b577637 
  include/mesos/master/master.proto 483dc2f59684481c9f549de9f06eef4dda8a46e1 
  include/mesos/v1/master/master.proto 09a82af88303a2d971da7c56a7075d7005932363 
  src/master/constants.hpp 5dd0667f62d2c0617cc0d5aed8cc005bd8344c88 
  src/master/flags.hpp 6a17b763dc76daa10073394f416b049e97a44238 
  src/master/flags.cpp e5edf3333d0a0c529a10dc602ef9a88a0ec60c69 
  src/master/http.cpp ee559804c490d53be2d12f233ae0bfb93ed9f96d 
  src/master/master.hpp 89b3c394b268a8645885412aeb19980db8d73c69 
  src/master/master.cpp b664550d57ef9805bd23ea35ca7f9cd8f4b1ab78 
  src/tests/partition_tests.cpp e1e2025bd0f078836323cbd8c6d7836815c4c38d 

Diff: https://reviews.apache.org/r/54183/diff/


Testing
-------

`make check`


Thanks,

Neil Conway

Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

Posted by Neil Conway <ne...@gmail.com>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/
-----------------------------------------------------------

(Updated Dec. 12, 2016, 7:06 p.m.)


Review request for mesos and Vinod Kone.


Changes
-------

Rebase, tweak comment.


Bugs: MESOS-6619
    https://issues.apache.org/jira/browse/MESOS-6619


Repository: mesos


Description
-------

Before partition-awareness, when an agent failed health checks, the
master removed the agent from the registry, marked all of its tasks
TASK_LOST, and moved them to the `completedTasks` list in the master's
memory. Although "lost" tasks might still be running, partitioned agents
would only be allowed to re-register if the master failed over, in which
case the `completedTasks` map would be emptied.

When partition-awareness was introduced, we initially followed the same
scheme, with the only difference that partition-aware tasks are marked
TASK_UNREACHABLE, not TASK_LOST.

This scheme has a few shortcomings. First, partition-aware tasks might
resume running when the partitioned agent re-registers. Second, we
re-added non-partition aware tasks when the agent re-registered but then
marked them completed when the framework is shutdown, resulting in two
entries in `completedTasks`.

This commit introduces a separate bounded map, `unreachableTasks`. These
tasks are reported separately via the HTTP endpoints, because they have
different semantics (unlike completed tasks, unreachable tasks can
resume running). The size of this map is limited by a new master flag,
`--max_unreachable_tasks_per_framework`. This commit also changes the
master to omit re-adding non-partition-aware tasks on re-registering
agents (unless the master has failed over): those tasks will shortly be
shutdown anyway.

Finally, this commit fixes a minor bug in the previous code: the
previous coding neglected to shutdown non-partition-aware frameworks
running on pre-1.0 Mesos agents that re-register with the master after
a network partition.


Diffs (updated)
-----

  docs/configuration.md efe3e9bd9d203a7ba44adf4ead24f14b8b577637 
  include/mesos/master/master.proto 0d33251a9016ab99a1d70f15637d55f41caefb63 
  include/mesos/v1/master/master.proto 09a82af88303a2d971da7c56a7075d7005932363 
  src/master/constants.hpp 5dd0667f62d2c0617cc0d5aed8cc005bd8344c88 
  src/master/flags.hpp 6a17b763dc76daa10073394f416b049e97a44238 
  src/master/flags.cpp e5edf3333d0a0c529a10dc602ef9a88a0ec60c69 
  src/master/http.cpp d52806dcf8e4d64ebb98e191a01408c0fcae17ac 
  src/master/master.hpp c304e69af0b7a720ea8277088cabc0675eec8b57 
  src/master/master.cpp 8c1c7f94102a2f40fbdffaa36f2d1c15e78a906d 
  src/tests/partition_tests.cpp 00cc815529dc4d303db638680eacb8f55713d1a1 

Diff: https://reviews.apache.org/r/54183/diff/


Testing
-------

`make check`


Thanks,

Neil Conway

Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

Posted by Neil Conway <ne...@gmail.com>.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/
-----------------------------------------------------------

(Updated Dec. 9, 2016, 2:38 p.m.)


Review request for mesos and Vinod Kone.


Changes
-------

Rebase.


Bugs: MESOS-6619
    https://issues.apache.org/jira/browse/MESOS-6619


Repository: mesos


Description
-------

Before partition-awareness, when an agent failed health checks, the
master removed the agent from the registry, marked all of its tasks
TASK_LOST, and moved them to the `completedTasks` list in the master's
memory. Although "lost" tasks might still be running, partitioned agents
would only be allowed to re-register if the master failed over, in which
case the `completedTasks` map would be emptied.

When partition-awareness was introduced, we initially followed the same
scheme, with the only difference that partition-aware tasks are marked
TASK_UNREACHABLE, not TASK_LOST.

This scheme has a few shortcomings. First, partition-aware tasks might
resume running when the partitioned agent re-registers. Second, we
re-added non-partition aware tasks when the agent re-registered but then
marked them completed when the framework is shutdown, resulting in two
entries in `completedTasks`.

This commit introduces a separate bounded map, `unreachableTasks`. These
tasks are reported separately via the HTTP endpoints, because they have
different semantics (unlike completed tasks, unreachable tasks can
resume running). The size of this map is limited by a new master flag,
`--max_unreachable_tasks_per_framework`. This commit also changes the
master to omit re-adding non-partition-aware tasks on re-registering
agents (unless the master has failed over): those tasks will shortly be
shutdown anyway.

Finally, this commit fixes a minor bug in the previous code: the
previous coding neglected to shutdown non-partition-aware frameworks
running on pre-1.0 Mesos agents that re-register with the master after
a network partition.


Diffs (updated)
-----

  docs/configuration.md efe3e9bd9d203a7ba44adf4ead24f14b8b577637 
  include/mesos/master/master.proto 0d33251a9016ab99a1d70f15637d55f41caefb63 
  include/mesos/v1/master/master.proto 09a82af88303a2d971da7c56a7075d7005932363 
  src/master/constants.hpp 5dd0667f62d2c0617cc0d5aed8cc005bd8344c88 
  src/master/flags.hpp 6a17b763dc76daa10073394f416b049e97a44238 
  src/master/flags.cpp e5edf3333d0a0c529a10dc602ef9a88a0ec60c69 
  src/master/http.cpp d8cff310ecadf5bb301af9f2aa47acdf2dcd42d2 
  src/master/master.hpp d3c20d7cbb9137d3b05cae3d67f004d5e7475c8a 
  src/master/master.cpp 353e6ea802e197b4456c1647f78d9984a50f1c9d 
  src/tests/partition_tests.cpp 00cc815529dc4d303db638680eacb8f55713d1a1 

Diff: https://reviews.apache.org/r/54183/diff/


Testing
-------

`make check`


Thanks,

Neil Conway