You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Benjamin Bannier <bb...@apache.org> on 2019/07/29 21:18:50 UTC

Review Request 71182: Stored last time a drain request was sent in the master.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71182/
-----------------------------------------------------------

Review request for mesos and Greg Mann.


Bugs: MESOS-9907
    https://issues.apache.org/jira/browse/MESOS-9907


Repository: mesos


Description
-------

This information can be used to calculate an approximate time when
draining an agent would be finished, e.g., by comparing
`DrainConfig.max_grace_period` and the field `drain_start_time_seconds`
added here, both obtained from the master API simultaneously. This
information is necessarily approximate as the agent might e.g., fail
over before it has finished draining which would reset the timeout; for
that specific case the master would send the drain request again so
after some time the value calculated from the master API would be in
line with the expected true value.

WIP: drain start time in master


Diffs
-----

  include/mesos/master/master.proto 07bf4e7194399ef660d7282275f25e1d4563e473 
  include/mesos/v1/master/master.proto 9bcef2f358b50fd63cdfdec5dcad742930de3ecc 
  src/common/protobuf_utils.cpp 7778e7f2475e9d6125d1c599715c91715f3654d3 
  src/master/http.cpp 765d5052ab6a8d731113f8922d20fb280b843003 
  src/master/master.hpp 8bdbac90b9000e9126a7fad375642dcf0a03378e 
  src/master/readonly_handler.cpp a8931153dc1b3f204c1b6437b972fe6607f900bd 
  src/tests/api_tests.cpp 641eb15153ddb85df322aa6a133ca8e4c6d6a510 


Diff: https://reviews.apache.org/r/71182/diff/1/


Testing
-------

`make check`


Thanks,

Benjamin Bannier


Re: Review Request 71182: Stored last time a drain request was sent in the master.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71182/#review216997
-----------------------------------------------------------


Fix it, then Ship it!





include/mesos/master/master.proto
Lines 473 (patched)
<https://reviews.apache.org/r/71182/#comment304234>

    Make sure you update the name of this field in the description, and remove the "WIP" line.


- Greg Mann


On July 30, 2019, 12:39 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71182/
> -----------------------------------------------------------
> 
> (Updated July 30, 2019, 12:39 p.m.)
> 
> 
> Review request for mesos and Greg Mann.
> 
> 
> Bugs: MESOS-9907
>     https://issues.apache.org/jira/browse/MESOS-9907
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This information can be used to calculate an approximate time when
> draining an agent would be finished, e.g., by comparing
> `DrainConfig.max_grace_period` and the field `drain_start_time_seconds`
> added here, both obtained from the master API simultaneously. This
> information is necessarily approximate as the agent might e.g., fail
> over before it has finished draining which would reset the timeout; for
> that specific case the master would send the drain request again so
> after some time the value calculated from the master API would be in
> line with the expected true value.
> 
> WIP: drain start time in master
> 
> 
> Diffs
> -----
> 
>   include/mesos/master/master.proto 07bf4e7194399ef660d7282275f25e1d4563e473 
>   include/mesos/v1/master/master.proto 9bcef2f358b50fd63cdfdec5dcad742930de3ecc 
>   src/common/protobuf_utils.cpp 7778e7f2475e9d6125d1c599715c91715f3654d3 
>   src/master/http.cpp 765d5052ab6a8d731113f8922d20fb280b843003 
>   src/master/master.hpp 8bdbac90b9000e9126a7fad375642dcf0a03378e 
>   src/master/readonly_handler.cpp a8931153dc1b3f204c1b6437b972fe6607f900bd 
>   src/tests/api_tests.cpp 641eb15153ddb85df322aa6a133ca8e4c6d6a510 
> 
> 
> Diff: https://reviews.apache.org/r/71182/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 71182: Stored last time a drain request was sent in the master.

Posted by Benjamin Bannier <bb...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71182/
-----------------------------------------------------------

(Updated Aug. 1, 2019, 9:52 a.m.)


Review request for mesos and Greg Mann.


Bugs: MESOS-9907
    https://issues.apache.org/jira/browse/MESOS-9907


Repository: mesos


Description (updated)
-------

This information can be used to calculate an approximate time when
draining an agent would be finished, e.g., by comparing
`DrainConfig.max_grace_period` and the field
`estimated_drain_start_time` added here, both obtained from the master
API simultaneously. This information is necessarily approximate as the
agent might e.g., fail over before it has finished draining which would
reset the timeout; for that specific case the master would send the
drain request again so after some time the value calculated from the
master API would be in line with the expected true value.

WIP: drain start time in master


Diffs (updated)
-----

  include/mesos/master/master.proto 07bf4e7194399ef660d7282275f25e1d4563e473 
  include/mesos/v1/master/master.proto 9bcef2f358b50fd63cdfdec5dcad742930de3ecc 
  src/common/protobuf_utils.cpp 7778e7f2475e9d6125d1c599715c91715f3654d3 
  src/master/http.cpp 765d5052ab6a8d731113f8922d20fb280b843003 
  src/master/master.hpp 8bdbac90b9000e9126a7fad375642dcf0a03378e 
  src/master/readonly_handler.cpp a8931153dc1b3f204c1b6437b972fe6607f900bd 
  src/tests/api_tests.cpp 641eb15153ddb85df322aa6a133ca8e4c6d6a510 


Diff: https://reviews.apache.org/r/71182/diff/3/

Changes: https://reviews.apache.org/r/71182/diff/2-3/


Testing
-------

`make check`


Thanks,

Benjamin Bannier


Re: Review Request 71182: Stored last time a drain request was sent in the master.

Posted by Benjamin Bannier <bb...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71182/
-----------------------------------------------------------

(Updated July 30, 2019, 2:39 p.m.)


Review request for mesos and Greg Mann.


Changes
-------

Address comments from Greg


Bugs: MESOS-9907
    https://issues.apache.org/jira/browse/MESOS-9907


Repository: mesos


Description
-------

This information can be used to calculate an approximate time when
draining an agent would be finished, e.g., by comparing
`DrainConfig.max_grace_period` and the field `drain_start_time_seconds`
added here, both obtained from the master API simultaneously. This
information is necessarily approximate as the agent might e.g., fail
over before it has finished draining which would reset the timeout; for
that specific case the master would send the drain request again so
after some time the value calculated from the master API would be in
line with the expected true value.

WIP: drain start time in master


Diffs (updated)
-----

  include/mesos/master/master.proto 07bf4e7194399ef660d7282275f25e1d4563e473 
  include/mesos/v1/master/master.proto 9bcef2f358b50fd63cdfdec5dcad742930de3ecc 
  src/common/protobuf_utils.cpp 7778e7f2475e9d6125d1c599715c91715f3654d3 
  src/master/http.cpp 765d5052ab6a8d731113f8922d20fb280b843003 
  src/master/master.hpp 8bdbac90b9000e9126a7fad375642dcf0a03378e 
  src/master/readonly_handler.cpp a8931153dc1b3f204c1b6437b972fe6607f900bd 
  src/tests/api_tests.cpp 641eb15153ddb85df322aa6a133ca8e4c6d6a510 


Diff: https://reviews.apache.org/r/71182/diff/2/

Changes: https://reviews.apache.org/r/71182/diff/1-2/


Testing
-------

`make check`


Thanks,

Benjamin Bannier


Re: Review Request 71182: Stored last time a drain request was sent in the master.

Posted by Greg Mann <gr...@mesosphere.io>.

> On July 30, 2019, 9:50 a.m., Greg Mann wrote:
> > In the description: it's not correct that the timeout would be reset if the agent fails over. Since the grace period is sent to the executor when the first KILL event is set, an agent failover will cause another KILL event to be sent, but the grace period will not be extended.
> > 
> > The timeout _would_ be extended, however, if the agent fails over before it receives the `DrainSlaveMessage`.

Ah sorry, I think you just mean that the time will be reset in the API output.


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71182/#review216951
-----------------------------------------------------------


On July 29, 2019, 9:18 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71182/
> -----------------------------------------------------------
> 
> (Updated July 29, 2019, 9:18 p.m.)
> 
> 
> Review request for mesos and Greg Mann.
> 
> 
> Bugs: MESOS-9907
>     https://issues.apache.org/jira/browse/MESOS-9907
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This information can be used to calculate an approximate time when
> draining an agent would be finished, e.g., by comparing
> `DrainConfig.max_grace_period` and the field `drain_start_time_seconds`
> added here, both obtained from the master API simultaneously. This
> information is necessarily approximate as the agent might e.g., fail
> over before it has finished draining which would reset the timeout; for
> that specific case the master would send the drain request again so
> after some time the value calculated from the master API would be in
> line with the expected true value.
> 
> WIP: drain start time in master
> 
> 
> Diffs
> -----
> 
>   include/mesos/master/master.proto 07bf4e7194399ef660d7282275f25e1d4563e473 
>   include/mesos/v1/master/master.proto 9bcef2f358b50fd63cdfdec5dcad742930de3ecc 
>   src/common/protobuf_utils.cpp 7778e7f2475e9d6125d1c599715c91715f3654d3 
>   src/master/http.cpp 765d5052ab6a8d731113f8922d20fb280b843003 
>   src/master/master.hpp 8bdbac90b9000e9126a7fad375642dcf0a03378e 
>   src/master/readonly_handler.cpp a8931153dc1b3f204c1b6437b972fe6607f900bd 
>   src/tests/api_tests.cpp 641eb15153ddb85df322aa6a133ca8e4c6d6a510 
> 
> 
> Diff: https://reviews.apache.org/r/71182/diff/1/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 71182: Stored last time a drain request was sent in the master.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71182/#review216951
-----------------------------------------------------------



In the description: it's not correct that the timeout would be reset if the agent fails over. Since the grace period is sent to the executor when the first KILL event is set, an agent failover will cause another KILL event to be sent, but the grace period will not be extended.

The timeout _would_ be extended, however, if the agent fails over before it receives the `DrainSlaveMessage`.


include/mesos/master/master.proto
Lines 473 (patched)
<https://reviews.apache.org/r/71182/#comment304185>

    It would seem a bit more consistent to use `TimeInfo` here, since we have a couple other timestamp fields in this message which use that type. WDYT?



include/mesos/v1/master/master.proto
Lines 474 (patched)
<https://reviews.apache.org/r/71182/#comment304186>

    Consider `estimated_drain_start_time` for this field name, to help communicate that this is a best guess.


- Greg Mann


On July 29, 2019, 9:18 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71182/
> -----------------------------------------------------------
> 
> (Updated July 29, 2019, 9:18 p.m.)
> 
> 
> Review request for mesos and Greg Mann.
> 
> 
> Bugs: MESOS-9907
>     https://issues.apache.org/jira/browse/MESOS-9907
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This information can be used to calculate an approximate time when
> draining an agent would be finished, e.g., by comparing
> `DrainConfig.max_grace_period` and the field `drain_start_time_seconds`
> added here, both obtained from the master API simultaneously. This
> information is necessarily approximate as the agent might e.g., fail
> over before it has finished draining which would reset the timeout; for
> that specific case the master would send the drain request again so
> after some time the value calculated from the master API would be in
> line with the expected true value.
> 
> WIP: drain start time in master
> 
> 
> Diffs
> -----
> 
>   include/mesos/master/master.proto 07bf4e7194399ef660d7282275f25e1d4563e473 
>   include/mesos/v1/master/master.proto 9bcef2f358b50fd63cdfdec5dcad742930de3ecc 
>   src/common/protobuf_utils.cpp 7778e7f2475e9d6125d1c599715c91715f3654d3 
>   src/master/http.cpp 765d5052ab6a8d731113f8922d20fb280b843003 
>   src/master/master.hpp 8bdbac90b9000e9126a7fad375642dcf0a03378e 
>   src/master/readonly_handler.cpp a8931153dc1b3f204c1b6437b972fe6607f900bd 
>   src/tests/api_tests.cpp 641eb15153ddb85df322aa6a133ca8e4c6d6a510 
> 
> 
> Diff: https://reviews.apache.org/r/71182/diff/1/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>