Posted to reviews@mesos.apache.org by Greg Mann <gr...@mesosphere.io> on 2019/01/31 23:08:13 UTC

Review Request 69876: Removed operations from master state when an agent is downgraded.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69876/
-----------------------------------------------------------

Review request for mesos, Benjamin Bannier, Chun-Hung Hsiao, and Gastón Kleiman.


Bugs: MESOS-9535
    https://issues.apache.org/jira/browse/MESOS-9535


Repository: mesos


Description
-------

When an agent is downgraded from a version with the AGENT_OPERATION_FEEDBACK
capability to one without it, the master needs to remove from its state any
terminal-but-unACKed operations on agent default resources, since the
downgraded agent will not resend status updates for these operations.
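
As a rough illustration of the cleanup described above, here is a minimal,
self-contained sketch. It does not use the real Mesos types: `Agent`,
`Operation`, `OperationState`, and `cleanupAfterDowngrade` are hypothetical
names chosen for this example, and the actual change in src/master/master.cpp
may be structured quite differently.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the master's bookkeeping.
// These are NOT the real Mesos types.
enum class OperationState { PENDING, FINISHED, FAILED };

struct Operation {
  OperationState latestState;
  bool acknowledged;             // Has the terminal update been ACKed?
  bool onAgentDefaultResources;  // False for resource-provider resources.
};

struct Agent {
  bool hasOperationFeedback;  // Stands in for AGENT_OPERATION_FEEDBACK.
  std::map<std::string, Operation> operations;  // Keyed by operation UUID.
};

bool isTerminal(OperationState state) {
  return state == OperationState::FINISHED || state == OperationState::FAILED;
}

// Assumed to run when an agent reregisters. If the agent lost the
// capability, drop every terminal-but-unACKed operation on agent default
// resources and return the UUIDs that were removed.
std::vector<std::string> cleanupAfterDowngrade(
    Agent& agent, bool reregisteredWithFeedback) {
  std::vector<std::string> removed;

  const bool downgraded =
      agent.hasOperationFeedback && !reregisteredWithFeedback;
  agent.hasOperationFeedback = reregisteredWithFeedback;

  if (!downgraded) {
    return removed;
  }

  const std::size_t before = agent.operations.size();

  for (auto it = agent.operations.begin(); it != agent.operations.end();) {
    const Operation& operation = it->second;
    if (isTerminal(operation.latestState) && !operation.acknowledged &&
        operation.onAgentDefaultResources) {
      removed.push_back(it->first);
      it = agent.operations.erase(it);  // erase() returns the next iterator.
    } else {
      ++it;
    }
  }

  // Every operation is either still tracked or reported as removed.
  assert(agent.operations.size() + removed.size() == before);
  return removed;
}
```

In this sketch, pending operations, already-ACKed operations, and operations
on resource-provider resources are left untouched; only the operations whose
updates the downgraded agent would never resend are dropped.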


Diffs
-----

  src/master/master.cpp f74b7c280569e1c24e0940463bb28bd795d429d5 
  src/tests/master_tests.cpp acc6096239e4992bdca084d88880d644ab4a2385 


Diff: https://reviews.apache.org/r/69876/diff/1/


Testing
-------

`make check`
`bin/mesos-tests.sh --gtest_filter="*CleanupOperationsAfterAgentDowngrade*" --gtest_repeat=-1 --gtest_break_on_failure`


Thanks,

Greg Mann


Re: Review Request 69876: Removed operations from master state when an agent is downgraded.

Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69876/#review212491
-----------------------------------------------------------



PASS: Mesos patch 69876 was successfully built and tested.

Reviews applied: `['69876']`

All the build artifacts available at: http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2842/mesos-review-69876

- Mesos Reviewbot Windows




Re: Review Request 69876: Removed operations from master state when an agent is downgraded.

Posted by Gastón Kleiman <ga...@mesosphere.io>.

> On Feb. 4, 2019, 4:28 p.m., Gastón Kleiman wrote:
> > src/tests/master_tests.cpp
> > Lines 9419 (patched)
> > <https://reviews.apache.org/r/69876/diff/1/?file=2123554#file2123554line9419>
> >
> >     We should consider making the agent not recover the operation status update manager if it isn't started with the `AGENT_OPERATION_FEEDBACK` capability.

If we don't, we should not drop this message, and we should make sure that the framework can acknowledge the update so that the agent stops resending it.


- Gastón


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69876/#review212538
-----------------------------------------------------------




Re: Review Request 69876: Removed operations from master state when an agent is downgraded.

Posted by Greg Mann <gr...@mesosphere.io>.

> On Feb. 5, 2019, 12:28 a.m., Gastón Kleiman wrote:
> > src/tests/master_tests.cpp
> > Lines 9419 (patched)
> > <https://reviews.apache.org/r/69876/diff/1/?file=2123554#file2123554line9419>
> >
> >     We should consider making the agent not recover the operation status update manager if it isn't started with the `AGENT_OPERATION_FEEDBACK` capability.
> 
> Gastón Kleiman wrote:
>     If we don't, we should not drop this message and make sure that the framework can acknowledge the update, so that the agent stops resending it.

I created a ticket to track this work: https://issues.apache.org/jira/browse/MESOS-9561


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69876/#review212538
-----------------------------------------------------------




Re: Review Request 69876: Removed operations from master state when an agent is downgraded.

Posted by Greg Mann <gr...@mesosphere.io>.

> On Feb. 5, 2019, 12:28 a.m., Gastón Kleiman wrote:
> > src/tests/master_tests.cpp
> > Lines 9419 (patched)
> > <https://reviews.apache.org/r/69876/diff/1/?file=2123554#file2123554line9419>
> >
> >     We should consider making the agent not recover the operation status update manager if it isn't started with the `AGENT_OPERATION_FEEDBACK` capability.
> 
> Gastón Kleiman wrote:
>     If we don't, we should not drop this message and make sure that the framework can acknowledge the update, so that the agent stops resending it.
> 
> Greg Mann wrote:
>     I created a ticket to track this work: https://issues.apache.org/jira/browse/MESOS-9561
> 
> Greg Mann wrote:
>     Unfortunately I need to remove this test entirely since I'm making the AGENT_OPERATION_FEEDBACK capability required for agent startup in https://reviews.apache.org/r/69958/

To address the original comment: as we discussed offline, it seems reasonable to let the agent recover the operation status update manager (SUM) when started without the new capability, since this will allow it to keep sending updates for operations submitted while the capability was enabled. The master will simply refuse to forward to the agent any future operations that request feedback on agent default resources.
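
A minimal sketch of the forwarding rule described above, under stated
assumptions: `AgentInfo`, `OperationRequest`, and `shouldForward` are
hypothetical names for this example only, and the real check in the master
may live elsewhere and take other conditions into account.

```cpp
// Hypothetical, simplified stand-ins; these are NOT the real Mesos types.
struct AgentInfo {
  bool hasOperationFeedback;  // Stands in for AGENT_OPERATION_FEEDBACK.
};

struct OperationRequest {
  bool requestsFeedback;         // The framework asked for status updates.
  bool onAgentDefaultResources;  // False for resource-provider resources.
};

// Returns false only for the combination the agent cannot serve: feedback
// requested, on agent default resources, for an agent without the capability.
constexpr bool shouldForward(
    const AgentInfo& agent, const OperationRequest& request) {
  return !(request.requestsFeedback && request.onAgentDefaultResources &&
           !agent.hasOperationFeedback);
}
```

Under this rule, an agent without the capability would still receive
operations that do not request feedback, as well as operations on
resource-provider resources.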


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69876/#review212538
-----------------------------------------------------------




Re: Review Request 69876: Removed operations from master state when an agent is downgraded.

Posted by Greg Mann <gr...@mesosphere.io>.

> On Feb. 5, 2019, 12:28 a.m., Gastón Kleiman wrote:
> > src/tests/master_tests.cpp
> > Lines 9419 (patched)
> > <https://reviews.apache.org/r/69876/diff/1/?file=2123554#file2123554line9419>
> >
> >     We should consider making the agent not recover the operation status update manager if it isn't started with the `AGENT_OPERATION_FEEDBACK` capability.
> 
> Gastón Kleiman wrote:
>     If we don't, we should not drop this message and make sure that the framework can acknowledge the update, so that the agent stops resending it.
> 
> Greg Mann wrote:
>     I created a ticket to track this work: https://issues.apache.org/jira/browse/MESOS-9561

Unfortunately I need to remove this test entirely since I'm making the AGENT_OPERATION_FEEDBACK capability required for agent startup in https://reviews.apache.org/r/69958/


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69876/#review212538
-----------------------------------------------------------




Re: Review Request 69876: Removed operations from master state when an agent is downgraded.

Posted by Gastón Kleiman <ga...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69876/#review212538
-----------------------------------------------------------


Fix it, then Ship it!





src/tests/master_tests.cpp
Lines 9419 (patched)
<https://reviews.apache.org/r/69876/#comment298353>

    We should consider making the agent not recover the operation status update manager if it isn't started with the `AGENT_OPERATION_FEEDBACK` capability.


- Gastón Kleiman




Re: Review Request 69876: Removed operations from master state when an agent is downgraded.

Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69876/#review212768
-----------------------------------------------------------



FAIL: Failed to get dependent review IDs for the current patch.

Failed command: `python.exe D:\DCOS\mesos\mesos\support\get-review-ids.py -r 69876 -o C:\Users\jenkins\AppData\Local\Temp\mesos_dependent_review_ids`

All the build artifacts available at: http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2878/mesos-review-69876

Relevant logs:

- [get-review-ids.log](http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2878/mesos-review-69876/logs/get-review-ids.log):

```
Dependent review: https://reviews.apache.org/api/review-requests/69957/
Error handling URL https://reviews.apache.org/api/review-requests/69957/: HTTP Error 401: UNAUTHORIZED
Traceback (most recent call last):
  File "D:\DCOS\mesos\mesos\support\get-review-ids.py", line 62, in <module>
    main()
  File "D:\DCOS\mesos\mesos\support\get-review-ids.py", line 51, in main
    review_ids = handler.get_dependent_review_ids(review_request)
  File "D:\DCOS\mesos\mesos\support\common.py", line 94, in get_dependent_review_ids
    self._review_ids(review_request, review_ids)
  File "D:\DCOS\mesos\mesos\support\common.py", line 58, in _review_ids
    dependent_review = self.api(review_url)["review_request"]
  File "D:\DCOS\mesos\mesos\support\common.py", line 80, in api
    return json.loads(urllib2.urlopen(url, data=data).read().decode(
  File "C:\Program Files\Python36\lib\urllib\request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Program Files\Python36\lib\urllib\request.py", line 532, in open
    response = meth(req, response)
  File "C:\Program Files\Python36\lib\urllib\request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Program Files\Python36\lib\urllib\request.py", line 570, in error
    return self._call_chain(*args)
  File "C:\Program Files\Python36\lib\urllib\request.py", line 504, in _call_chain
    result = func(*args)
  File "C:\Program Files\Python36\lib\urllib\request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 401: UNAUTHORIZED
```

- Mesos Reviewbot Windows




Re: Review Request 69876: Removed operations from master state when an agent is downgraded.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69876/
-----------------------------------------------------------

(Updated Feb. 12, 2019, 11:39 p.m.)


Review request for mesos, Benjamin Bannier, Chun-Hung Hsiao, and Gastón Kleiman.


Bugs: MESOS-9535
    https://issues.apache.org/jira/browse/MESOS-9535


Repository: mesos


Description
-------

When an agent is downgraded from one with the AGENT_OPERATION_FEEDBACK
capability to one without this capability, the master needs to remove
terminal-but-unACKed operations from its state which operate on agent
default resources, since the downgraded agent will not resend status
updates for these operations.


Diffs
-----

  src/master/master.cpp cf2210ec26642028d5e4fb7fc1841eb0a1ed3396 


Diff: https://reviews.apache.org/r/69876/diff/3/


Testing (updated)
-------

Testing details at the end of this chain.


Thanks,

Greg Mann


Re: Review Request 69876: Removed operations from master state when an agent is downgraded.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69876/
-----------------------------------------------------------

(Updated Feb. 12, 2019, 9:42 p.m.)


Review request for mesos, Benjamin Bannier, Chun-Hung Hsiao, and Gastón Kleiman.


Bugs: MESOS-9535
    https://issues.apache.org/jira/browse/MESOS-9535


Repository: mesos


Description
-------

When an agent is downgraded from one with the AGENT_OPERATION_FEEDBACK
capability to one without this capability, the master needs to remove
terminal-but-unACKed operations from its state which operate on agent
default resources, since the downgraded agent will not resend status
updates for these operations.


Diffs (updated)
-----

  src/master/master.cpp cf2210ec26642028d5e4fb7fc1841eb0a1ed3396 


Diff: https://reviews.apache.org/r/69876/diff/3/

Changes: https://reviews.apache.org/r/69876/diff/2-3/


Testing
-------

`make check`
`bin/mesos-tests.sh --gtest_filter="*CleanupOperationsAfterAgentDowngrade*" --gtest_repeat=-1 --gtest_break_on_failure`


Thanks,

Greg Mann


Re: Review Request 69876: Removed operations from master state when an agent is downgraded.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69876/
-----------------------------------------------------------

(Updated Feb. 12, 2019, 9:36 p.m.)


Review request for mesos, Benjamin Bannier, Chun-Hung Hsiao, and Gastón Kleiman.


Bugs: MESOS-9535
    https://issues.apache.org/jira/browse/MESOS-9535


Repository: mesos


Description
-------

When an agent is downgraded from one with the AGENT_OPERATION_FEEDBACK
capability to one without this capability, the master needs to remove
terminal-but-unACKed operations from its state which operate on agent
default resources, since the downgraded agent will not resend status
updates for these operations.


Diffs (updated)
-----

  src/master/master.cpp cf2210ec26642028d5e4fb7fc1841eb0a1ed3396 


Diff: https://reviews.apache.org/r/69876/diff/2/

Changes: https://reviews.apache.org/r/69876/diff/1-2/


Testing
-------

`make check`
`bin/mesos-tests.sh --gtest_filter="*CleanupOperationsAfterAgentDowngrade*" --gtest_repeat=-1 --gtest_break_on_failure`


Thanks,

Greg Mann