Posted to reviews@mesos.apache.org by Megha Sharma <ms...@apple.com> on 2017/09/11 21:23:32 UTC
Re: Review Request 61473: Do not kill non partition aware tasks.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61473/
-----------------------------------------------------------
(Updated Sept. 11, 2017, 9:23 p.m.)
Review request for mesos, Vinod Kone and Jiang Yan Xu.
Bugs: MESOS-7215
https://issues.apache.org/jira/browse/MESOS-7215
Repository: mesos
Description
-------
The master will no longer kill the tasks of non-partition-aware
frameworks when an unreachable agent re-registers with the master.
The master used to send a ShutdownFrameworkMessage to the agent
to kill the tasks of non-partition-aware frameworks, including
frameworks that are still registered. This was problematic because
offers from this agent could still go to the same framework, which could
then launch new tasks. The agent would receive tasks of that
framework and ignore them because it thinks the framework is shutting
down. The framework is not actually shutting down, so from the master's
and the scheduler’s perspective the task remains pending in STAGING
until the next agent re-registration, which could happen much later.
This commit fixes the problem by not shutting down non-partition-aware
frameworks on such an agent.
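
The failure mode described above can be illustrated with a toy model. This is only a sketch, not Mesos code: the `Agent` class, `reregister`, `launch`, and the `old_behaviour` flag are hypothetical names invented for illustration, and status updates are modelled as immediate.

```python
# Toy model of the behaviour described in this review; not Mesos code.
# Every class, function, and parameter name below is hypothetical.


class Agent:
    def __init__(self):
        # Framework ids the agent believes are shutting down.
        self.shutting_down = set()
        # Task id -> framework id for tasks the agent accepted.
        self.tasks = {}

    def shutdown_framework(self, framework_id):
        # Models receipt of a ShutdownFrameworkMessage: mark the framework
        # as shutting down and drop its tasks.
        self.shutting_down.add(framework_id)
        self.tasks = {
            t: f for t, f in self.tasks.items() if f != framework_id
        }

    def run_task(self, framework_id, task_id):
        # Tasks for a framework marked as shutting down are ignored.
        if framework_id in self.shutting_down:
            return False
        self.tasks[task_id] = framework_id
        return True


def reregister(agent, frameworks, old_behaviour):
    # `frameworks` maps framework id -> whether it is partition aware.
    for framework_id, partition_aware in frameworks.items():
        if old_behaviour and not partition_aware:
            agent.shutdown_framework(framework_id)


def launch(agent, master_view, framework_id, task_id):
    # The master records the task as staging, then forwards it to the agent.
    master_view[task_id] = "TASK_STAGING"
    if agent.run_task(framework_id, task_id):
        master_view[task_id] = "TASK_RUNNING"


frameworks = {"fw": False}  # one registered, non-partition-aware framework

# Old behaviour: the agent is told to shut the framework down on
# re-registration, so a later task launch is ignored.
old_agent, old_view = Agent(), {}
reregister(old_agent, frameworks, old_behaviour=True)
launch(old_agent, old_view, "fw", "t1")

# Behaviour after this change: no shutdown message, so the task runs.
new_agent, new_view = Agent(), {}
reregister(new_agent, frameworks, old_behaviour=False)
launch(new_agent, new_view, "fw", "t1")
```

In the first run the master's view of `t1` never leaves the staging state because the agent dropped the task without reporting anything; in the second run the agent accepts it.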
Diffs (updated)
-----
src/master/http.cpp 28d0393fb5962df4d731521265efd81a54e1e655
src/master/master.hpp 05f88111afb4fa0e2baf57106e1479914c16a113
src/master/master.cpp 6d84a26bff970b842b58dfb69dbf232ba5c16a20
src/tests/partition_tests.cpp 0886f4890ac3fec6f38146946892769a99c3e68f
Diff: https://reviews.apache.org/r/61473/diff/7/
Changes: https://reviews.apache.org/r/61473/diff/6-7/
Testing
-------
make check
Thanks,
Megha Sharma
Re: Review Request 61473: Do not kill non partition aware tasks.
Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61473/#review185268
-----------------------------------------------------------
FAIL: Failed to apply the current review.
Failed command: `python.exe .\support\apply-reviews.py -n -r 61473`
All the build artifacts available at: http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/61473
Relevant logs:
- [apply-review-61473-stderr.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/61473/logs/apply-review-61473-stderr.log):
```
Traceback (most recent call last):
  File ".\support\apply-reviews.py", line 417, in <module>
    main()
  File ".\support\apply-reviews.py", line 412, in main
    reviewboard(options)
  File ".\support\apply-reviews.py", line 402, in reviewboard
    apply_review(options)
  File ".\support\apply-reviews.py", line 160, in apply_review
    commit_patch(options)
  File ".\support\apply-reviews.py", line 261, in commit_patch
    message.write(data['message'])
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 655: ordinal not in range(128)
```
- Mesos Reviewbot Windows
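
The traceback points at the right single quotation mark (U+2019) in "scheduler’s" in the review description, which cannot be encoded as ASCII. The sketch below reproduces that class of error and shows one common way to avoid it, by writing the message through a file opened with an explicit UTF-8 encoding. It is an illustration only: how `apply-reviews.py` actually opens its message file is not shown in the log, and `write_commit_message` and `read_commit_message` are hypothetical helpers, not functions from that script.

```python
import io
import os
import tempfile


def write_commit_message(path, message):
    # Hypothetical helper: an explicit encoding means the write does not
    # depend on the platform's default codec.
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(message)


def read_commit_message(path):
    # Hypothetical helper: read the message back with the same encoding.
    with io.open(path, "r", encoding="utf-8") as f:
        return f.read()


# A fragment of the description containing the offending character.
message = u"the scheduler\u2019s perspective"

# Encoding to ASCII fails with the same exception type as in the log.
try:
    message.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Writing through a UTF-8 file handle round-trips the text unchanged.
path = os.path.join(tempfile.mkdtemp(), "message.txt")
write_commit_message(path, message)
round_trip = read_commit_message(path)
```

Replacing the curly apostrophe in the review description with a plain ASCII `'` would sidestep the failure without touching the script.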