Posted to reviews@aurora.apache.org by Santhosh Kumar Shanmugham <sa...@gmail.com> on 2016/12/02 08:43:49 UTC
Review Request 54299: Extend warm-up time by
`max_consecutive_failures` attempts.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54299/
-----------------------------------------------------------
Review request for Aurora, David McLaughlin, Joshua Cohen, Stephan Erb, and Zameer Manji.
Bugs: AURORA-1841
https://issues.apache.org/jira/browse/AURORA-1841
Repository: aurora
Description
-------
It is possible to set the health checks such that a task can
continually fail health checks with intermittent successes and still
succeed an update. Essentially, a task can fail health checks during
`initial_interval_secs` and for an additional `max_consecutive_failures`
attempts, and then perform one successful health check to become healthy.
To be backward compatible with the above configuration, include
`max_consecutive_failures` when computing `max_attempts_to_running`.
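As a rough illustration of the case described above (not the code under review), the sketch below replays a sequence of health check results and reports whether a task would become healthy within an attempt budget. The function name `becomes_healthy`, its parameters, and the budget formula are assumptions made for this example only.

```python
def becomes_healthy(results, max_consecutive_failures, min_consecutive_successes=1):
    """Replay health check results taken after the initial interval.

    Hypothetical helper: the attempt budget below is an assumed reading of
    `max_attempts_to_running`, not the executor's actual computation.
    """
    max_attempts = max_consecutive_failures + min_consecutive_successes
    streak = 0
    for passed in results[:max_attempts]:
        # A failure resets the run of consecutive successes.
        streak = streak + 1 if passed else 0
        if streak >= min_consecutive_successes:
            return True
    return False


# Two failures followed by one success, with max_consecutive_failures=2.
print(becomes_healthy([False, False, True], max_consecutive_failures=2))
```

With the failures counted in the budget, the late success still lets the task become healthy, which is the backward-compatible outcome the description asks for.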
Diffs
-----
docs/features/services.md 50189eeff26ce9614d092f6abd9246788647fe2b
src/main/python/apache/aurora/executor/common/health_checker.py 12af9d8635a553eabe918a86508aa6ce2fd78a49
src/test/python/apache/aurora/executor/common/test_health_checker.py e2a7f164a24f49dd1f4cdba136e838b9d42d73a2
Diff: https://reviews.apache.org/r/54299/diff/
Testing
-------
build-support/jenkins/build.sh
src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
Thanks,
Santhosh Kumar Shanmugham
Re: Review Request 54299: Extend warm-up time by
`max_consecutive_failures` attempts.
Posted by Santhosh Kumar Shanmugham <sa...@gmail.com>.
> On Dec. 2, 2016, 11:54 a.m., Zameer Manji wrote:
> > It took me a long time to understand this after staring at the tests, but I think this is correct.
> >
> > This is unfortunately a little complex to understand. For bonus points, would it be possible to encode some of this information in a diagram?
> >
> > The tests are thorough, which makes me comfortable in shipping this change.
https://docs.google.com/document/d/1KOO0LC046k75TqQqJ4c0FQcVGbxvrn71E10wAjMorVY/edit?usp=sharing
- Santhosh Kumar
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54299/#review157807
-----------------------------------------------------------
Re: Review Request 54299: Extend warm-up time by
`max_consecutive_failures` attempts.
Posted by Zameer Manji <zm...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54299/#review157807
-----------------------------------------------------------
Ship it!
It took me a long time to understand this after staring at the tests, but I think this is correct.
This is unfortunately a little complex to understand. For bonus points, would it be possible to encode some of this information in a diagram?
The tests are thorough, which makes me comfortable in shipping this change.
- Zameer Manji
Re: Review Request 54299: Extend warm-up time by
`max_consecutive_failures` attempts.
Posted by Santhosh Kumar Shanmugham <sa...@gmail.com>.
> On Dec. 2, 2016, 1:44 p.m., Joshua Cohen wrote:
> > src/main/python/apache/aurora/executor/common/health_checker.py, lines 115-117
> > <https://reviews.apache.org/r/54299/diff/1/?file=1574585#file1574585line115>
> >
> > There still exists the chance for a backwards incompatibility here. Under the previous watch-driven updates, a task could flip between failing and successful health checks, and as long as it's still running at the end of `watch_secs` the updater would consider it healthy and move on. With this new behavior, someone could configure a task in such a way that the max attempts are consumed without reaching `max_consecutive_failures` or `min_consecutive_successes` before `watch_secs` is elapsed, meaning that the task would fail.
> >
> > As we discussed earlier, if we make `watch_secs` and `min_consecutive_successes` mutually exclusive in the client, then the executor could only trigger the new behavior if the user opted in by setting `watch_secs` to 0 and `min_consecutive_successes` to non-zero.
I believe that the situation you are describing would occur only when `min_consecutive_successes > 1`, which means that the user has already opted in to the new behavior.
Old behavior:
With `watch_secs`:
Task starts in `RUNNING` state. The task has to report at least 1 success within the first `initial_interval_secs + max_consecutive_failures * interval_secs` (no health checks are done during `initial_interval_secs`, hence the multiplier is `max_consecutive_failures` and not `max_consecutive_failures + 1`). Following this, the task must report at least 1 success every `max_consecutive_failures` health checks to remain in `RUNNING`, until `watch_secs` expires.
Without `watch_secs`:
Task starts in `RUNNING` state. The task has to report at least 1 success within the first `initial_interval_secs + max_consecutive_failures * interval_secs` (no health checks are done during `initial_interval_secs`, hence the multiplier is `max_consecutive_failures` and not `max_consecutive_failures + 1`).
New behavior:
With `watch_secs`:
The task has to report at least `min_consecutive_successes` (default=1) within the first `initial_interval_secs + (max_consecutive_failures + min_consecutive_successes) * interval_secs` to move to `RUNNING` state. Following this, the task must report at least 1 success every `max_consecutive_failures` health checks to remain in `RUNNING`, until `watch_secs` expires.
Without `watch_secs`:
The task has to report at least `min_consecutive_successes` (default=1) within the first `initial_interval_secs + (max_consecutive_failures + min_consecutive_successes) * interval_secs` to move to `RUNNING` state.
Once in `RUNNING`, `min_consecutive_successes` is irrelevant, since the only transition possible is from `RUNNING` to a terminal state. Hence it is enough for a task to report just 1 success every `max_consecutive_failures` health checks to remain healthy. One might argue that `min_consecutive_successes` is not necessary in the first place. On the other hand, one can argue that it serves as a replacement for `watch_secs`, enforcing tighter healthiness conditions before a task is treated as successfully updated and thereby preventing bad updates from succeeding.
All in all, setting `min_consecutive_successes` to 1 as the default should provide us with the necessary backward compatibility.
Please refer to the diagrams in the design document. https://docs.google.com/document/d/1KOO0LC046k75TqQqJ4c0FQcVGbxvrn71E10wAjMorVY/edit?usp=sharing
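For a quick numeric comparison, the two time windows described above can be written out as follows. This is only a sketch of the formulas quoted in this reply; the function names and the sample values are made up for illustration and are not taken from the patch.

```python
def old_window_secs(initial_interval_secs, interval_secs, max_consecutive_failures):
    """Old behavior: time allowed for the first successful health check."""
    return initial_interval_secs + max_consecutive_failures * interval_secs


def new_window_secs(initial_interval_secs, interval_secs,
                    max_consecutive_failures, min_consecutive_successes=1):
    """New behavior: time allowed to reach `min_consecutive_successes`."""
    attempts = max_consecutive_failures + min_consecutive_successes
    return initial_interval_secs + attempts * interval_secs


# Illustrative values only: 15s initial interval, 10s interval, 3 failures allowed.
print(old_window_secs(15, 10, 3))  # 45
print(new_window_secs(15, 10, 3))  # 55
```

With the default `min_consecutive_successes` of 1, the new window is one `interval_secs` longer than the old one for the same settings.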
> On Dec. 2, 2016, 1:44 p.m., Joshua Cohen wrote:
> > src/main/python/apache/aurora/executor/common/health_checker.py, line 113
> > <https://reviews.apache.org/r/54299/diff/1/?file=1574585#file1574585line113>
> >
> > s/suppose/supposed
Done.
- Santhosh Kumar
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54299/#review157764
-----------------------------------------------------------
Re: Review Request 54299: Extend warm-up time by
`max_consecutive_failures` attempts.
Posted by Joshua Cohen <jc...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54299/#review157764
-----------------------------------------------------------
src/main/python/apache/aurora/executor/common/health_checker.py (line 113)
<https://reviews.apache.org/r/54299/#comment228376>
s/suppose/supposed
src/main/python/apache/aurora/executor/common/health_checker.py (lines 115 - 117)
<https://reviews.apache.org/r/54299/#comment228434>
There still exists the chance for a backwards incompatibility here. Under the previous watch-driven updates, a task could flip between failing and successful health checks, and as long as it's still running at the end of `watch_secs` the updater would consider it healthy and move on. With this new behavior, someone could configure a task in such a way that the max attempts are consumed without reaching `max_consecutive_failures` or `min_consecutive_successes` before `watch_secs` is elapsed, meaning that the task would fail.
As we discussed earlier, if we make `watch_secs` and `min_consecutive_successes` mutually exclusive in the client, then the executor could only trigger the new behavior if the user opted in by setting `watch_secs` to 0 and `min_consecutive_successes` to non-zero.
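The opt-in rule proposed above could look roughly like the sketch below. It is a hypothetical client-side check written for this thread, not code from the patch: the function name and the choice to raise `ValueError` are assumptions.

```python
def uses_new_behavior(watch_secs, min_consecutive_successes):
    """Return True when the user has opted in to the new behavior.

    Hypothetical check: treats the two settings as mutually exclusive,
    as proposed in the comment above.
    """
    if watch_secs > 0 and min_consecutive_successes > 0:
        raise ValueError(
            "watch_secs and min_consecutive_successes are mutually exclusive")
    return watch_secs == 0 and min_consecutive_successes > 0


print(uses_new_behavior(watch_secs=0, min_consecutive_successes=1))
```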
- Joshua Cohen
Re: Review Request 54299: Extend warm-up time by
`max_consecutive_failures` attempts.
Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54299/#review158183
-----------------------------------------------------------
Ship it!
Master (4bc5246) is green with this patch.
./build-support/jenkins/build.sh
I will refresh this build result if you post a review containing "@ReviewBot retry"
- Aurora ReviewBot
Re: Review Request 54299: Extend warm-up time by
`max_consecutive_failures` attempts.
Posted by David McLaughlin <da...@dmclaughlin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54299/#review158182
-----------------------------------------------------------
Ship it!
Ship It!
- David McLaughlin
Re: Review Request 54299: Extend warm-up time by
`max_consecutive_failures` attempts.
Posted by Joshua Cohen <jc...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54299/#review158490
-----------------------------------------------------------
Ship it!
Ship It!
- Joshua Cohen
Re: Review Request 54299: Extend warm-up time by
`max_consecutive_failures` attempts.
Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54299/#review158456
-----------------------------------------------------------
Ship it!
Master (91ddb07) is green with this patch.
./build-support/jenkins/build.sh
I will refresh this build result if you post a review containing "@ReviewBot retry"
- Aurora ReviewBot
Re: Review Request 54299: Extend warm-up time by
`max_consecutive_failures` attempts.
Posted by Santhosh Kumar Shanmugham <sa...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54299/
-----------------------------------------------------------
(Updated Dec. 7, 2016, 4:15 p.m.)
Review request for Aurora, David McLaughlin, Joshua Cohen, and Zameer Manji.
Changes
-------
Return only the reason field instead of the entire StatusResult object.
Bugs: AURORA-1841
https://issues.apache.org/jira/browse/AURORA-1841
Repository: aurora
Description
-------
It is possible to set the health checks such that a task can
continually fail health checks with intermittent successes and still
succeed an update. Essentially, a task can fail health checks during
`initial_interval_secs` and for an additional `max_consecutive_failures`
attempts, and then perform one successful health check to become healthy.
To be backward compatible with the above configuration, include
`max_consecutive_failures` when computing `max_attempts_to_running`.
Diffs (updated)
-----
docs/features/services.md 50189eeff26ce9614d092f6abd9246788647fe2b
src/main/python/apache/aurora/executor/aurora_executor.py d01fcb9594552eb6cdfbdbab2d03707738df3443
src/main/python/apache/aurora/executor/common/health_checker.py 12af9d8635a553eabe918a86508aa6ce2fd78a49
src/test/python/apache/aurora/executor/common/test_health_checker.py e2a7f164a24f49dd1f4cdba136e838b9d42d73a2
Diff: https://reviews.apache.org/r/54299/diff/
Testing
-------
build-support/jenkins/build.sh
src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
Thanks,
Santhosh Kumar Shanmugham
Re: Review Request 54299: Extend warm-up time by
`max_consecutive_failures` attempts.
Posted by Santhosh Kumar Shanmugham <sa...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54299/
-----------------------------------------------------------
(Updated Dec. 6, 2016, 5:32 p.m.)
Review request for Aurora, David McLaughlin, Joshua Cohen, and Zameer Manji.
Changes
-------
I don't have the necessary time to review this one properly. Sorry.
Bugs: AURORA-1841
https://issues.apache.org/jira/browse/AURORA-1841
Repository: aurora
Description
-------
It is possible to set the health checks such that a task can
continually fail health checks with intermittent successes and still
succeed an update. Essentially, a task can fail health checks during
`initial_interval_secs` and for an additional `max_consecutive_failures`
attempts, and then perform one successful health check to become healthy.
To be backward compatible with the above configuration, include
`max_consecutive_failures` when computing `max_attempts_to_running`.
Diffs (updated)
-----
docs/features/services.md 50189eeff26ce9614d092f6abd9246788647fe2b
src/main/python/apache/aurora/executor/common/health_checker.py 12af9d8635a553eabe918a86508aa6ce2fd78a49
src/test/python/apache/aurora/executor/common/test_health_checker.py e2a7f164a24f49dd1f4cdba136e838b9d42d73a2
Diff: https://reviews.apache.org/r/54299/diff/
Testing
-------
build-support/jenkins/build.sh
src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
Thanks,
Santhosh Kumar Shanmugham
Re: Review Request 54299: Extend warm-up time by
`max_consecutive_failures` attempts.
Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54299/#review157722
-----------------------------------------------------------
Ship it!
Master (3ea0331) is green with this patch.
./build-support/jenkins/build.sh
I will refresh this build result if you post a review containing "@ReviewBot retry"
- Aurora ReviewBot