Posted to reviews@mesos.apache.org by Benjamin Bannier <be...@mesosphere.io> on 2019/02/05 16:40:51 UTC

Re: Review Request 69680: Have master acknowledge operation updates of completed frameworks.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69680/
-----------------------------------------------------------

(Updated Feb. 5, 2019, 5:40 p.m.)


Review request for mesos, Gastón Kleiman and Greg Mann.


Changes
-------

Fixed comment as suggested by Greg.


Bugs: MESOS-9434
    https://issues.apache.org/jira/browse/MESOS-9434


Repository: mesos


Description
-------

When a framework was removed while it still had unacknowledged operation
status updates, its terminal operations could never be removed because
nobody was left to acknowledge them.

In this patch we make the master acknowledge operation status updates
for frameworks it knows are removed so that, e.g., terminal operations
can be removed. Since masters do not persist completed frameworks, this
is not reliable (e.g., an agent was partitioned for a long time, still
tracks a completed framework's `FrameworkInfo`, and comes back only
after the master that knew about the framework's completion has failed
over). We merely extend the existing master behavior (e.g., sending
`ShutdownFrameworkMessage` to all currently registered agents) to
operations.
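
To make the behavior and its limitation concrete, here is a minimal toy
sketch. It is not the code from this patch and does not use the real
Mesos types: `ToyMaster`, `OperationStatusUpdate`, `shouldAcknowledge`,
and `example` are hypothetical names introduced only for illustration.
The sketch assumes the master keeps completed frameworks purely in
memory, as described above.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical, simplified stand-in for an operation status update.
struct OperationStatusUpdate {
  std::string frameworkId;
  std::string statusUuid;
};

// Toy model of the master's in-memory framework bookkeeping.
class ToyMaster {
public:
  void registerFramework(const std::string& id) { registered.insert(id); }

  // A removed framework moves to the in-memory set of completed frameworks.
  void removeFramework(const std::string& id) {
    if (registered.erase(id) > 0) {
      completed.insert(id);
    }
  }

  // Completed frameworks are not persisted, so a failover forgets them.
  void failover() {
    registered.clear();
    completed.clear();
  }

  // The master acknowledges on behalf of a framework only if it still
  // knows that the framework was removed.
  bool shouldAcknowledge(const OperationStatusUpdate& update) const {
    return completed.count(update.frameworkId) > 0;
  }

private:
  std::set<std::string> registered;
  std::set<std::string> completed;
};

// Walks through the scenario from the description.
inline void example() {
  ToyMaster master;
  master.registerFramework("fw-1");
  const OperationStatusUpdate update{"fw-1", "status-1"};

  assert(!master.shouldAcknowledge(update)); // Framework acknowledges itself.
  master.removeFramework("fw-1");
  assert(master.shouldAcknowledge(update));  // Master steps in.
  master.failover();
  assert(!master.shouldAcknowledge(update)); // Knowledge lost; best effort.
}
```

The last assertion mirrors the partitioned-agent case: once the master
that knew about the completion has failed over, the update is no longer
acknowledged in this model.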


Diffs (updated)
-----

  src/master/master.cpp f74b7c280569e1c24e0940463bb28bd795d429d5 
  src/tests/master_tests.cpp acc6096239e4992bdca084d88880d644ab4a2385 


Diff: https://reviews.apache.org/r/69680/diff/3/

Changes: https://reviews.apache.org/r/69680/diff/2-3/


Testing
-------

* `make check`
* tested on a number of configurations in internal CI
* ran added test in repetition, both with and without additional stress


Thanks,

Benjamin Bannier


Re: Review Request 69680: Have master acknowledge operation updates of completed frameworks.

Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69680/#review212565
-----------------------------------------------------------



PASS: Mesos patch 69680 was successfully built and tested.

Reviews applied: `['69854', '69680']`

All the build artifacts available at: http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2853/mesos-review-69680

- Mesos Reviewbot Windows


On Feb. 5, 2019, 5:02 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69680/
> -----------------------------------------------------------
> 
> (Updated Feb. 5, 2019, 5:02 p.m.)
> 
> 
> Review request for mesos, Gastón Kleiman and Greg Mann.
> 
> 
> Bugs: MESOS-9434
>     https://issues.apache.org/jira/browse/MESOS-9434
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> When a framework was removed while it still had unacknowledged operation
> status updates, its terminal operations could never be removed because
> nobody was left to acknowledge them.
> 
> In this patch we make the master acknowledge operation status updates
> for frameworks it knows are removed so that, e.g., terminal operations
> can be removed. Since masters do not persist completed frameworks, this
> is not reliable (e.g., an agent was partitioned for a long time, still
> tracks a completed framework's `FrameworkInfo`, and comes back only
> after the master that knew about the framework's completion has failed
> over). We merely extend the existing master behavior (e.g., sending
> `ShutdownFrameworkMessage` to all currently registered agents) to
> operations.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp f74b7c280569e1c24e0940463bb28bd795d429d5 
>   src/tests/master_tests.cpp acc6096239e4992bdca084d88880d644ab4a2385 
> 
> 
> Diff: https://reviews.apache.org/r/69680/diff/3/
> 
> 
> Testing
> -------
> 
> * `make check`
> * tested on a number of configurations in internal CI
> * ran added test in repetition, both with and without additional stress
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 69680: Have master acknowledge operation updates of completed frameworks.

Posted by Benjamin Bannier <be...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69680/
-----------------------------------------------------------

(Updated Feb. 5, 2019, 6:02 p.m.)


Review request for mesos, Gastón Kleiman and Greg Mann.


Bugs: MESOS-9434
    https://issues.apache.org/jira/browse/MESOS-9434


Repository: mesos


Description
-------

When a framework was removed while it still had unacknowledged operation
status updates, its terminal operations could never be removed because
nobody was left to acknowledge them.

In this patch we make the master acknowledge operation status updates
for frameworks it knows are removed so that, e.g., terminal operations
can be removed. Since masters do not persist completed frameworks, this
is not reliable (e.g., an agent was partitioned for a long time, still
tracks a completed framework's `FrameworkInfo`, and comes back only
after the master that knew about the framework's completion has failed
over). We merely extend the existing master behavior (e.g., sending
`ShutdownFrameworkMessage` to all currently registered agents) to
operations.


Diffs
-----

  src/master/master.cpp f74b7c280569e1c24e0940463bb28bd795d429d5 
  src/tests/master_tests.cpp acc6096239e4992bdca084d88880d644ab4a2385 


Diff: https://reviews.apache.org/r/69680/diff/3/


Testing
-------

* `make check`
* tested on a number of configurations in internal CI
* ran added test in repetition, both with and without additional stress


Thanks,

Benjamin Bannier