You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@aurora.apache.org by David McLaughlin <da...@dmclaughlin.com> on 2018/01/23 06:02:35 UTC
Review Request 65281: Support PARTITIONED state in SLA calculations
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65281/
-----------------------------------------------------------
Review request for Aurora and Jordan Ly.
Repository: aurora
Description
-------
Support PARTITIONED state in SLA calculations. Also added a test to protect against this test failing in the future.
Diffs
-----
src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 5d8d5bd8f705770979f284d26d2e932aabe707e5
src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 2e719ac6b7aea86faa22deff2cc6b5f73135761c
Diff: https://reviews.apache.org/r/65281/diff/1/
Testing
-------
./gradlew test
Thanks,
David McLaughlin
Re: Review Request 65281: Support PARTITIONED state in SLA
calculations
Posted by Jordan Ly <jo...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65281/#review196046
-----------------------------------------------------------
Ship it!
Nice test for future cases.
- Jordan Ly
On Jan. 23, 2018, 6:02 a.m., David McLaughlin wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65281/
> -----------------------------------------------------------
>
> (Updated Jan. 23, 2018, 6:02 a.m.)
>
>
> Review request for Aurora and Jordan Ly.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> Support PARTITIONED state in SLA calculations. Also added a test to protect against this test failing in the future.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 5d8d5bd8f705770979f284d26d2e932aabe707e5
> src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 2e719ac6b7aea86faa22deff2cc6b5f73135761c
>
>
> Diff: https://reviews.apache.org/r/65281/diff/1/
>
>
> Testing
> -------
>
> ./gradlew test
>
>
> Thanks,
>
> David McLaughlin
>
>
Re: Review Request 65281: Support PARTITIONED state in SLA
calculations
Posted by Stephan Erb <se...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65281/#review196010
-----------------------------------------------------------
Ship it!
Ship It!
- Stephan Erb
On Jan. 23, 2018, 7:02 a.m., David McLaughlin wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65281/
> -----------------------------------------------------------
>
> (Updated Jan. 23, 2018, 7:02 a.m.)
>
>
> Review request for Aurora and Jordan Ly.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> Support PARTITIONED state in SLA calculations. Also added a test to protect against this test failing in the future.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 5d8d5bd8f705770979f284d26d2e932aabe707e5
> src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 2e719ac6b7aea86faa22deff2cc6b5f73135761c
>
>
> Diff: https://reviews.apache.org/r/65281/diff/1/
>
>
> Testing
> -------
>
> ./gradlew test
>
>
> Thanks,
>
> David McLaughlin
>
>
Re: Review Request 65281: Support PARTITIONED state in SLA
calculations
Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65281/#review196196
-----------------------------------------------------------
Ship it!
Master (dbe7137) is green with this patch.
./build-support/jenkins/build.sh
I will refresh this build result if you post a review containing "@ReviewBot retry"
- Aurora ReviewBot
On Jan. 25, 2018, 2:04 a.m., David McLaughlin wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65281/
> -----------------------------------------------------------
>
> (Updated Jan. 25, 2018, 2:04 a.m.)
>
>
> Review request for Aurora and Jordan Ly.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> Support PARTITIONED state in SLA calculations. Also added a test to protect against this test failing in the future.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 5d8d5bd8f705770979f284d26d2e932aabe707e5
> src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 2e719ac6b7aea86faa22deff2cc6b5f73135761c
>
>
> Diff: https://reviews.apache.org/r/65281/diff/2/
>
>
> Testing
> -------
>
> ./gradlew test
>
>
> Thanks,
>
> David McLaughlin
>
>
Re: Review Request 65281: Support PARTITIONED state in SLA
calculations
Posted by David McLaughlin <da...@dmclaughlin.com>.
> On Jan. 25, 2018, 2:59 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
> > Lines 333 (patched)
> > <https://reviews.apache.org/r/65281/diff/2/?file=1946477#file1946477line333>
> >
> > We should only consider `UP` if the previous state is also `UP` for `PARTITIONED` state. For instance, `KILLING` -> `PARTITIONED` should we counted as `REMOVED`.
>
> David McLaughlin wrote:
> You cannot move from KILLING to PARTITIONED.
>
> Santhosh Kumar Shanmugham wrote:
> I was able to trigger a transition from KILLING to PARTITIONED.
>
> - Creating job
> ```
> vagrant@aurora:~$ aurora job create devcluster/vagrant/test/partition_aware_disabled aurora/src/test/sh/org/apache/aurora/e2e/partition_aware.aurora
> INFO] Creating job partition_aware_disabled
> INFO] Checking status of devcluster/vagrant/test/partition_aware_disabled
> Job create succeeded: job url=http://aurora.local:8081/scheduler/vagrant/test/partition_aware_disabled
> ```
>
> - Checking Status
> ```
> vagrant@aurora:~$ aurora job status devcluster
> INFO] Retrieving jobs for role None
> INFO] Checking status of devcluster/vagrant/test/partition_aware_disabled
> Active tasks (1):
> Task role: vagrant, env: test, name: partition_aware_disabled, instance: 0, status: RUNNING on 192.168.33.7
> CPU: 0.2 core(s), RAM: 1 MB, Disk: 8 MB
> events:
> 2018-01-25 04:07:08 PENDING: None
> 2018-01-25 04:07:08 ASSIGNED: None
> 2018-01-25 04:07:09 STARTING: None
> 2018-01-25 04:07:10 RUNNING: No health-check defined, task is assumed healthy.
> ```
>
> - Create Partition
> ```
> vagrant@aurora:~$ sudo stop mesos-slave
> mesos-slave stop/waiting
> ```
>
> - Kill task
> ```
> vagrant@aurora:~$ aurora job killall devcluster/vagrant/test/partition_aware_disabled
> INFO] Killing tasks for job: devcluster/vagrant/test/partition_aware_disabled
> INFO] Instances to be killed: [0]
> Instances [0] were not killed in time
> Exceeded maximum number of errors while killing instances
> ```
>
> - Checking Status
> ```
> vagrant@aurora:~$ aurora job status devcluster
> INFO] Retrieving jobs for role None
> INFO] Checking status of devcluster/vagrant/test/partition_aware_disabled
> Active tasks (1):
> Task role: vagrant, env: test, name: partition_aware_disabled, instance: 0, status: KILLING on 192.168.33.7
> CPU: 0.2 core(s), RAM: 1 MB, Disk: 8 MB
> events:
> 2018-01-25 04:07:08 PENDING: None
> 2018-01-25 04:07:08 ASSIGNED: None
> 2018-01-25 04:07:09 STARTING: None
> 2018-01-25 04:07:10 RUNNING: No health-check defined, task is assumed healthy.
> 2018-01-25 04:10:10 KILLING: Killed by aurora
> ```
>
> Scheduler Logs:
> ```
> I0125 04:11:18.817 [Thread-66, MesosCallbackHandler$MesosCallbackHandlerImpl] Received status update for task vagrant-test-partition_aware_disabled-0-3cb6e9ae-6643-460e-a08b-0bdba4bdd8d4 in state TASK_UNREACHABLE from SOURCE_MASTER with REASON_AGENT_REMOVED: Agent 192.168.33.7 is unreachable: health check timed out
> W0125 04:11:18.818 [AsyncProcessor-0, Stats] Re-using already registered variable for key task_delivery_delay_SOURCE_MASTER_timeouts_per_sec
> W0125 04:11:18.818 [AsyncProcessor-0, Stats] Re-using already registered variable for key task_delivery_delay_SOURCE_MASTER_requests_per_sec
> I0125 04:11:18.819 [TaskStatusHandlerImpl, StateMachine] vagrant-test-partition_aware_disabled-0-3cb6e9ae-6643-460e-a08b-0bdba4bdd8d4 state machine transition KILLING -> PARTITIONED
> I0125 04:11:18.820 [TaskStatusHandlerImpl, StateMachine] vagrant-test-partition_aware_disabled-0-3cb6e9ae-6643-460e-a08b-0bdba4bdd8d4 state machine transition PARTITIONED -> LOST
> I0125 04:11:18.820 [TaskStatusHandlerImpl, StateManagerImpl] Task being rescheduled: vagrant-test-partition_aware_disabled-0-3cb6e9ae-6643-460e-a08b-0bdba4bdd8d4
> I0125 04:11:18.821 [TaskStatusHandlerImpl, StateMachine] vagrant-test-partition_aware_disabled-0-27c9bed4-708d-4d71-a0c1-33584b81c654 state machine transition INIT -> PENDING
> ```
>
> As pointed out in https://issues.apache.org/jira/browse/AURORA-1966, this causes the Scheduler to indefinitely keep killing the partitioned task.
Right, that bug aside, KILLING -> PARTITIONED triggers an immediate transition to LOST. The intermediate state of PARTITONED is basically a noop.
- David
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65281/#review196198
-----------------------------------------------------------
On Jan. 25, 2018, 2:04 a.m., David McLaughlin wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65281/
> -----------------------------------------------------------
>
> (Updated Jan. 25, 2018, 2:04 a.m.)
>
>
> Review request for Aurora and Jordan Ly.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> Support PARTITIONED state in SLA calculations. Also added a test to protect against this test failing in the future.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 5d8d5bd8f705770979f284d26d2e932aabe707e5
> src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 2e719ac6b7aea86faa22deff2cc6b5f73135761c
>
>
> Diff: https://reviews.apache.org/r/65281/diff/2/
>
>
> Testing
> -------
>
> ./gradlew test
>
>
> Thanks,
>
> David McLaughlin
>
>
Re: Review Request 65281: Support PARTITIONED state in SLA
calculations
Posted by David McLaughlin <da...@dmclaughlin.com>.
> On Jan. 25, 2018, 2:59 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
> > Lines 333 (patched)
> > <https://reviews.apache.org/r/65281/diff/2/?file=1946477#file1946477line333>
> >
> > We should only consider `UP` if the previous state is also `UP` for `PARTITIONED` state. For instance, `KILLING` -> `PARTITIONED` should we counted as `REMOVED`.
You cannot move from KILLING to PARTITIONED.
- David
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65281/#review196198
-----------------------------------------------------------
On Jan. 25, 2018, 2:04 a.m., David McLaughlin wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65281/
> -----------------------------------------------------------
>
> (Updated Jan. 25, 2018, 2:04 a.m.)
>
>
> Review request for Aurora and Jordan Ly.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> Support PARTITIONED state in SLA calculations. Also added a test to protect against this test failing in the future.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 5d8d5bd8f705770979f284d26d2e932aabe707e5
> src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 2e719ac6b7aea86faa22deff2cc6b5f73135761c
>
>
> Diff: https://reviews.apache.org/r/65281/diff/2/
>
>
> Testing
> -------
>
> ./gradlew test
>
>
> Thanks,
>
> David McLaughlin
>
>
Re: Review Request 65281: Support PARTITIONED state in SLA
calculations
Posted by Santhosh Kumar Shanmugham <sa...@gmail.com>.
> On Jan. 24, 2018, 6:59 p.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
> > Lines 333 (patched)
> > <https://reviews.apache.org/r/65281/diff/2/?file=1946477#file1946477line333>
> >
> > We should only consider `UP` if the previous state is also `UP` for `PARTITIONED` state. For instance, `KILLING` -> `PARTITIONED` should we counted as `REMOVED`.
>
> David McLaughlin wrote:
> You cannot move from KILLING to PARTITIONED.
I was able to trigger a transition from KILLING to PARTITIONED.
- Creating job
```
vagrant@aurora:~$ aurora job create devcluster/vagrant/test/partition_aware_disabled aurora/src/test/sh/org/apache/aurora/e2e/partition_aware.aurora
INFO] Creating job partition_aware_disabled
INFO] Checking status of devcluster/vagrant/test/partition_aware_disabled
Job create succeeded: job url=http://aurora.local:8081/scheduler/vagrant/test/partition_aware_disabled
```
- Checking Status
```
vagrant@aurora:~$ aurora job status devcluster
INFO] Retrieving jobs for role None
INFO] Checking status of devcluster/vagrant/test/partition_aware_disabled
Active tasks (1):
Task role: vagrant, env: test, name: partition_aware_disabled, instance: 0, status: RUNNING on 192.168.33.7
CPU: 0.2 core(s), RAM: 1 MB, Disk: 8 MB
events:
2018-01-25 04:07:08 PENDING: None
2018-01-25 04:07:08 ASSIGNED: None
2018-01-25 04:07:09 STARTING: None
2018-01-25 04:07:10 RUNNING: No health-check defined, task is assumed healthy.
```
- Create Partition
```
vagrant@aurora:~$ sudo stop mesos-slave
mesos-slave stop/waiting
```
- Kill task
```
vagrant@aurora:~$ aurora job killall devcluster/vagrant/test/partition_aware_disabled
INFO] Killing tasks for job: devcluster/vagrant/test/partition_aware_disabled
INFO] Instances to be killed: [0]
Instances [0] were not killed in time
Exceeded maximum number of errors while killing instances
```
- Checking Status
```
vagrant@aurora:~$ aurora job status devcluster
INFO] Retrieving jobs for role None
INFO] Checking status of devcluster/vagrant/test/partition_aware_disabled
Active tasks (1):
Task role: vagrant, env: test, name: partition_aware_disabled, instance: 0, status: KILLING on 192.168.33.7
CPU: 0.2 core(s), RAM: 1 MB, Disk: 8 MB
events:
2018-01-25 04:07:08 PENDING: None
2018-01-25 04:07:08 ASSIGNED: None
2018-01-25 04:07:09 STARTING: None
2018-01-25 04:07:10 RUNNING: No health-check defined, task is assumed healthy.
2018-01-25 04:10:10 KILLING: Killed by aurora
```
Scheduler Logs:
```
I0125 04:11:18.817 [Thread-66, MesosCallbackHandler$MesosCallbackHandlerImpl] Received status update for task vagrant-test-partition_aware_disabled-0-3cb6e9ae-6643-460e-a08b-0bdba4bdd8d4 in state TASK_UNREACHABLE from SOURCE_MASTER with REASON_AGENT_REMOVED: Agent 192.168.33.7 is unreachable: health check timed out
W0125 04:11:18.818 [AsyncProcessor-0, Stats] Re-using already registered variable for key task_delivery_delay_SOURCE_MASTER_timeouts_per_sec
W0125 04:11:18.818 [AsyncProcessor-0, Stats] Re-using already registered variable for key task_delivery_delay_SOURCE_MASTER_requests_per_sec
I0125 04:11:18.819 [TaskStatusHandlerImpl, StateMachine] vagrant-test-partition_aware_disabled-0-3cb6e9ae-6643-460e-a08b-0bdba4bdd8d4 state machine transition KILLING -> PARTITIONED
I0125 04:11:18.820 [TaskStatusHandlerImpl, StateMachine] vagrant-test-partition_aware_disabled-0-3cb6e9ae-6643-460e-a08b-0bdba4bdd8d4 state machine transition PARTITIONED -> LOST
I0125 04:11:18.820 [TaskStatusHandlerImpl, StateManagerImpl] Task being rescheduled: vagrant-test-partition_aware_disabled-0-3cb6e9ae-6643-460e-a08b-0bdba4bdd8d4
I0125 04:11:18.821 [TaskStatusHandlerImpl, StateMachine] vagrant-test-partition_aware_disabled-0-27c9bed4-708d-4d71-a0c1-33584b81c654 state machine transition INIT -> PENDING
```
As pointed out in https://issues.apache.org/jira/browse/AURORA-1966, this causes the Scheduler to indefinitely keep killing the partitioned task.
- Santhosh Kumar
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65281/#review196198
-----------------------------------------------------------
On Jan. 24, 2018, 6:04 p.m., David McLaughlin wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65281/
> -----------------------------------------------------------
>
> (Updated Jan. 24, 2018, 6:04 p.m.)
>
>
> Review request for Aurora and Jordan Ly.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> Support PARTITIONED state in SLA calculations. Also added a test to protect against this test failing in the future.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 5d8d5bd8f705770979f284d26d2e932aabe707e5
> src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 2e719ac6b7aea86faa22deff2cc6b5f73135761c
>
>
> Diff: https://reviews.apache.org/r/65281/diff/2/
>
>
> Testing
> -------
>
> ./gradlew test
>
>
> Thanks,
>
> David McLaughlin
>
>
Re: Review Request 65281: Support PARTITIONED state in SLA
calculations
Posted by Santhosh Kumar Shanmugham <sa...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65281/#review196198
-----------------------------------------------------------
src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
Lines 333 (patched)
<https://reviews.apache.org/r/65281/#comment275747>
We should only consider `UP` if the previous state is also `UP` for `PARTITIONED` state. For instance, `KILLING` -> `PARTITIONED` should we counted as `REMOVED`.
- Santhosh Kumar Shanmugham
On Jan. 24, 2018, 6:04 p.m., David McLaughlin wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65281/
> -----------------------------------------------------------
>
> (Updated Jan. 24, 2018, 6:04 p.m.)
>
>
> Review request for Aurora and Jordan Ly.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> Support PARTITIONED state in SLA calculations. Also added a test to protect against this test failing in the future.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 5d8d5bd8f705770979f284d26d2e932aabe707e5
> src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 2e719ac6b7aea86faa22deff2cc6b5f73135761c
>
>
> Diff: https://reviews.apache.org/r/65281/diff/2/
>
>
> Testing
> -------
>
> ./gradlew test
>
>
> Thanks,
>
> David McLaughlin
>
>
Re: Review Request 65281: Support PARTITIONED state in SLA
calculations
Posted by David McLaughlin <da...@dmclaughlin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65281/
-----------------------------------------------------------
(Updated Jan. 25, 2018, 2:04 a.m.)
Review request for Aurora and Jordan Ly.
Changes
-------
Treat PARTITIONED as UP.
Repository: aurora
Description
-------
Support PARTITIONED state in SLA calculations. Also added a test to protect against this test failing in the future.
Diffs (updated)
-----
src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 5d8d5bd8f705770979f284d26d2e932aabe707e5
src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 2e719ac6b7aea86faa22deff2cc6b5f73135761c
Diff: https://reviews.apache.org/r/65281/diff/2/
Changes: https://reviews.apache.org/r/65281/diff/1-2/
Testing
-------
./gradlew test
Thanks,
David McLaughlin
Re: Review Request 65281: Support PARTITIONED state in SLA
calculations
Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65281/#review195975
-----------------------------------------------------------
Ship it!
Master (dbe7137) is green with this patch.
./build-support/jenkins/build.sh
I will refresh this build result if you post a review containing "@ReviewBot retry"
- Aurora ReviewBot
On Jan. 23, 2018, 6:02 a.m., David McLaughlin wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65281/
> -----------------------------------------------------------
>
> (Updated Jan. 23, 2018, 6:02 a.m.)
>
>
> Review request for Aurora and Jordan Ly.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> Support PARTITIONED state in SLA calculations. Also added a test to protect against this test failing in the future.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 5d8d5bd8f705770979f284d26d2e932aabe707e5
> src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 2e719ac6b7aea86faa22deff2cc6b5f73135761c
>
>
> Diff: https://reviews.apache.org/r/65281/diff/1/
>
>
> Testing
> -------
>
> ./gradlew test
>
>
> Thanks,
>
> David McLaughlin
>
>
Re: Review Request 65281: Support PARTITIONED state in SLA
calculations
Posted by Stephan Erb <se...@apache.org>.
> On Jan. 23, 2018, 7:11 p.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
> > Lines 319 (patched)
> > <https://reviews.apache.org/r/65281/diff/1/?file=1944352#file1944352line319>
> >
> > Shouldn't `RUNNING` -> `PARTITIONED` be `DOWN`?
>
> Bill Farner wrote:
> +1
I considered this as at first as well. However, it is the users choice to be in `PARTITIONED` state, so this should not be counted as "cluster-induced downtime". In addition, `PARTITIONED` does not indicate that the task is down. It merely means that we don't know.
- Stephan
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65281/#review196047
-----------------------------------------------------------
On Jan. 23, 2018, 7:02 a.m., David McLaughlin wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65281/
> -----------------------------------------------------------
>
> (Updated Jan. 23, 2018, 7:02 a.m.)
>
>
> Review request for Aurora and Jordan Ly.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> Support PARTITIONED state in SLA calculations. Also added a test to protect against this test failing in the future.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 5d8d5bd8f705770979f284d26d2e932aabe707e5
> src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 2e719ac6b7aea86faa22deff2cc6b5f73135761c
>
>
> Diff: https://reviews.apache.org/r/65281/diff/1/
>
>
> Testing
> -------
>
> ./gradlew test
>
>
> Thanks,
>
> David McLaughlin
>
>
Re: Review Request 65281: Support PARTITIONED state in SLA
calculations
Posted by David McLaughlin <da...@dmclaughlin.com>.
> On Jan. 23, 2018, 6:11 p.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
> > Lines 319 (patched)
> > <https://reviews.apache.org/r/65281/diff/1/?file=1944352#file1944352line319>
> >
> > Shouldn't `RUNNING` -> `PARTITIONED` be `DOWN`?
>
> Bill Farner wrote:
> +1
>
> Stephan Erb wrote:
> I considered this as at first as well. However, it is the users choice to be in `PARTITIONED` state, so this should not be counted as "cluster-induced downtime". In addition, `PARTITIONED` does not indicate that the task is down. It merely means that we don't know.
>
> David McLaughlin wrote:
> That was my thought too. They are optimistically still running.
>
> Santhosh Kumar Shanmugham wrote:
> That makes sense.
>
> Shouldn't we then include the time spent in `PARTITIONED` as `UP` in that case? If I am reading it correctly, we will exclude time spent in `PARTITIONED`, even if the task came an `UP` state.
You're right, this should carry the same logic as RUNNING.
- David
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65281/#review196047
-----------------------------------------------------------
On Jan. 23, 2018, 6:02 a.m., David McLaughlin wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65281/
> -----------------------------------------------------------
>
> (Updated Jan. 23, 2018, 6:02 a.m.)
>
>
> Review request for Aurora and Jordan Ly.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> Support PARTITIONED state in SLA calculations. Also added a test to protect against this test failing in the future.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 5d8d5bd8f705770979f284d26d2e932aabe707e5
> src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 2e719ac6b7aea86faa22deff2cc6b5f73135761c
>
>
> Diff: https://reviews.apache.org/r/65281/diff/1/
>
>
> Testing
> -------
>
> ./gradlew test
>
>
> Thanks,
>
> David McLaughlin
>
>
Re: Review Request 65281: Support PARTITIONED state in SLA
calculations
Posted by David McLaughlin <da...@dmclaughlin.com>.
> On Jan. 23, 2018, 6:11 p.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
> > Lines 319 (patched)
> > <https://reviews.apache.org/r/65281/diff/1/?file=1944352#file1944352line319>
> >
> > Shouldn't `RUNNING` -> `PARTITIONED` be `DOWN`?
>
> Bill Farner wrote:
> +1
>
> Stephan Erb wrote:
> I considered this as at first as well. However, it is the users choice to be in `PARTITIONED` state, so this should not be counted as "cluster-induced downtime". In addition, `PARTITIONED` does not indicate that the task is down. It merely means that we don't know.
>
> David McLaughlin wrote:
> That was my thought too. They are optimistically still running.
>
> Santhosh Kumar Shanmugham wrote:
> That makes sense.
>
> Shouldn't we then include the time spent in `PARTITIONED` as `UP` in that case? If I am reading it correctly, we will exclude time spent in `PARTITIONED`, even if the task came an `UP` state.
>
> David McLaughlin wrote:
> You're right, this should carry the same logic as RUNNING.
I've made this change. Santhosh I'll wait for your ship it before commit.
Question: I put PARTITIONED here blindly because I saw STARTING. STARTING is now a legitimate running state... should STARTING also be considered UP? I'm not sure if it actually impacts the SLA % reported, but do I know a whole bunch of crons can stay in STARTING for their whole duration.
- David
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65281/#review196047
-----------------------------------------------------------
On Jan. 25, 2018, 2:04 a.m., David McLaughlin wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65281/
> -----------------------------------------------------------
>
> (Updated Jan. 25, 2018, 2:04 a.m.)
>
>
> Review request for Aurora and Jordan Ly.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> Support PARTITIONED state in SLA calculations. Also added a test to protect against this test failing in the future.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 5d8d5bd8f705770979f284d26d2e932aabe707e5
> src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 2e719ac6b7aea86faa22deff2cc6b5f73135761c
>
>
> Diff: https://reviews.apache.org/r/65281/diff/2/
>
>
> Testing
> -------
>
> ./gradlew test
>
>
> Thanks,
>
> David McLaughlin
>
>
Re: Review Request 65281: Support PARTITIONED state in SLA
calculations
Posted by David McLaughlin <da...@dmclaughlin.com>.
> On Jan. 23, 2018, 6:11 p.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
> > Lines 319 (patched)
> > <https://reviews.apache.org/r/65281/diff/1/?file=1944352#file1944352line319>
> >
> > Shouldn't `RUNNING` -> `PARTITIONED` be `DOWN`?
>
> Bill Farner wrote:
> +1
>
> Stephan Erb wrote:
> I considered this as at first as well. However, it is the users choice to be in `PARTITIONED` state, so this should not be counted as "cluster-induced downtime". In addition, `PARTITIONED` does not indicate that the task is down. It merely means that we don't know.
That was my thought too. They are optimistically still running.
- David
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65281/#review196047
-----------------------------------------------------------
On Jan. 23, 2018, 6:02 a.m., David McLaughlin wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65281/
> -----------------------------------------------------------
>
> (Updated Jan. 23, 2018, 6:02 a.m.)
>
>
> Review request for Aurora and Jordan Ly.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> Support PARTITIONED state in SLA calculations. Also added a test to protect against this test failing in the future.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 5d8d5bd8f705770979f284d26d2e932aabe707e5
> src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 2e719ac6b7aea86faa22deff2cc6b5f73135761c
>
>
> Diff: https://reviews.apache.org/r/65281/diff/1/
>
>
> Testing
> -------
>
> ./gradlew test
>
>
> Thanks,
>
> David McLaughlin
>
>
Re: Review Request 65281: Support PARTITIONED state in SLA
calculations
Posted by Bill Farner <wf...@apache.org>.
> On Jan. 23, 2018, 10:11 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
> > Lines 319 (patched)
> > <https://reviews.apache.org/r/65281/diff/1/?file=1944352#file1944352line319>
> >
> > Shouldn't `RUNNING` -> `PARTITIONED` be `DOWN`?
+1
- Bill
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65281/#review196047
-----------------------------------------------------------
On Jan. 22, 2018, 10:02 p.m., David McLaughlin wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65281/
> -----------------------------------------------------------
>
> (Updated Jan. 22, 2018, 10:02 p.m.)
>
>
> Review request for Aurora and Jordan Ly.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> Support PARTITIONED state in SLA calculations. Also added a test to protect against this test failing in the future.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 5d8d5bd8f705770979f284d26d2e932aabe707e5
> src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 2e719ac6b7aea86faa22deff2cc6b5f73135761c
>
>
> Diff: https://reviews.apache.org/r/65281/diff/1/
>
>
> Testing
> -------
>
> ./gradlew test
>
>
> Thanks,
>
> David McLaughlin
>
>
Re: Review Request 65281: Support PARTITIONED state in SLA
calculations
Posted by Santhosh Kumar Shanmugham <sa...@gmail.com>.
> On Jan. 23, 2018, 10:11 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
> > Lines 319 (patched)
> > <https://reviews.apache.org/r/65281/diff/1/?file=1944352#file1944352line319>
> >
> > Shouldn't `RUNNING` -> `PARTITIONED` be `DOWN`?
>
> Bill Farner wrote:
> +1
>
> Stephan Erb wrote:
> I considered this as at first as well. However, it is the users choice to be in `PARTITIONED` state, so this should not be counted as "cluster-induced downtime". In addition, `PARTITIONED` does not indicate that the task is down. It merely means that we don't know.
>
> David McLaughlin wrote:
> That was my thought too. They are optimistically still running.
That makes sense.
Shouldn't we then include the time spent in `PARTITIONED` as `UP` in that case? If I am reading it correctly, we will exclude time spent in `PARTITIONED`, even if the task came an `UP` state.
- Santhosh Kumar
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65281/#review196047
-----------------------------------------------------------
On Jan. 22, 2018, 10:02 p.m., David McLaughlin wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65281/
> -----------------------------------------------------------
>
> (Updated Jan. 22, 2018, 10:02 p.m.)
>
>
> Review request for Aurora and Jordan Ly.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> Support PARTITIONED state in SLA calculations. Also added a test to protect against this test failing in the future.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 5d8d5bd8f705770979f284d26d2e932aabe707e5
> src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 2e719ac6b7aea86faa22deff2cc6b5f73135761c
>
>
> Diff: https://reviews.apache.org/r/65281/diff/1/
>
>
> Testing
> -------
>
> ./gradlew test
>
>
> Thanks,
>
> David McLaughlin
>
>
Re: Review Request 65281: Support PARTITIONED state in SLA
calculations
Posted by Santhosh Kumar Shanmugham <sa...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65281/#review196047
-----------------------------------------------------------
src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
Lines 319 (patched)
<https://reviews.apache.org/r/65281/#comment275563>
Shouldn't `RUNNING` -> `PARTITIONED` be `DOWN`?
- Santhosh Kumar Shanmugham
On Jan. 22, 2018, 10:02 p.m., David McLaughlin wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65281/
> -----------------------------------------------------------
>
> (Updated Jan. 22, 2018, 10:02 p.m.)
>
>
> Review request for Aurora and Jordan Ly.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> Support PARTITIONED state in SLA calculations. Also added a test to protect against this test failing in the future.
>
>
> Diffs
> -----
>
> src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 5d8d5bd8f705770979f284d26d2e932aabe707e5
> src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 2e719ac6b7aea86faa22deff2cc6b5f73135761c
>
>
> Diff: https://reviews.apache.org/r/65281/diff/1/
>
>
> Testing
> -------
>
> ./gradlew test
>
>
> Thanks,
>
> David McLaughlin
>
>