You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@yunikorn.apache.org by GitBox <gi...@apache.org> on 2020/09/11 07:10:27 UTC
[GitHub] [incubator-yunikorn-k8shim] TaoYang526 opened a new pull request #190: [YUNIKORN-410] Pod state change may cause incorrect update on SchedulerNode#occupied
TaoYang526 opened a new pull request #190:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/190
Details please refer to [YUNIKORN-410](https://issues.apache.org/jira/browse/YUNIKORN-410)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on a change in pull request #190: [YUNIKORN-410] Pod state change may cause incorrect update on SchedulerNode#occupied
Posted by GitBox <gi...@apache.org>.
yangwwei commented on a change in pull request #190:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/190#discussion_r487263705
##########
File path: pkg/cache/node_coordinator_test.go
##########
@@ -248,3 +248,45 @@ func TestDeleteTerminatedPod(t *testing.T) {
coordinator.deletePod(pod2)
assert.Equal(t, executed, false)
}
+
+func TestUpdatePodPendingRunningAlternatively(t *testing.T) {
+ mockedSchedulerAPI := newMockSchedulerAPI()
+ nodes := newSchedulerNodes(mockedSchedulerAPI, NewTestSchedulerCache())
+ host1 := utils.NodeForTest("HOST1", "10G", "10")
+ nodes.addNode(host1)
+ coordinator := newNodeResourceCoordinator(nodes)
+
+ // init pod (pod1) and changed pod (pod2)
+ pod1 := utils.PodForTest("pod1", "1G", "500m")
+ pod1.SetUID("UID-01")
+ pod2 := utils.PodForTest("pod1", "1G", "500m")
+ pod2.SetUID("UID-01")
+
+ // pod state changed from pending to running, trigger an update
+ pod1.Status.Phase = v1.PodPending
+ pod2.Status.Phase = v1.PodRunning
+ pod1.Spec.NodeName = "HOST1"
+ pod2.Spec.NodeName = "HOST1"
+ coordinator.updatePod(pod1, pod2)
+ node1 := coordinator.nodes.nodesMap["HOST1"]
+ assert.Equal(t, node1.occupied.Resources[constants.Memory].Value, int64(1000))
+ assert.Equal(t, node1.occupied.Resources[constants.CPU].Value, int64(500))
+
+ // pod state changed from running to pending,
Review comment:
Can pod state change from `Running` to `Pending`?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on pull request #190: [YUNIKORN-410] Pod state change may cause incorrect update on SchedulerNode#occupied
Posted by GitBox <gi...@apache.org>.
yangwwei commented on pull request #190:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/190#issuecomment-691287098
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-k8shim] codecov[bot] edited a comment on pull request #190: [YUNIKORN-410] Pod state change may cause incorrect update on SchedulerNode#occupied
Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #190:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/190#issuecomment-690934990
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-k8shim] codecov[bot] edited a comment on pull request #190: [YUNIKORN-410] Pod state change may cause incorrect update on SchedulerNode#occupied
Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #190:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/190#issuecomment-690934990
# [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190?src=pr&el=h1) Report
> Merging [#190](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/commit/d093b0d550e4aff685fdd42ed166db0ec77d27dd?el=desc) will **decrease** coverage by `0.00%`.
> The diff coverage is `100.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190/graphs/tree.svg?width=650&height=150&src=pr&token=LZImIuvleR)](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #190 +/- ##
==========================================
- Coverage 58.94% 58.94% -0.01%
==========================================
Files 33 33
Lines 3303 3305 +2
==========================================
+ Hits 1947 1948 +1
- Misses 1285 1286 +1
Partials 71 71
```
| [Impacted Files](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [pkg/cache/node\_coordinator.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190/diff?src=pr&el=tree#diff-cGtnL2NhY2hlL25vZGVfY29vcmRpbmF0b3IuZ28=) | `65.62% <100.00%> (+1.10%)` | :arrow_up: |
| [pkg/shim/scheduler\_mock.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190/diff?src=pr&el=tree#diff-cGtnL3NoaW0vc2NoZWR1bGVyX21vY2suZ28=) | `85.00% <0.00%> (-0.84%)` | :arrow_down: |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190?src=pr&el=footer). Last update [d093b0d...cd65189](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-k8shim] codecov[bot] commented on pull request #190: [YUNIKORN-410] Pod state change may cause incorrect update on SchedulerNode#occupied
Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on pull request #190:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/190#issuecomment-690934990
# [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190?src=pr&el=h1) Report
> Merging [#190](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/commit/d093b0d550e4aff685fdd42ed166db0ec77d27dd?el=desc) will **decrease** coverage by `0.00%`.
> The diff coverage is `100.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190/graphs/tree.svg?width=650&height=150&src=pr&token=LZImIuvleR)](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #190 +/- ##
==========================================
- Coverage 58.94% 58.94% -0.01%
==========================================
Files 33 33
Lines 3303 3305 +2
==========================================
+ Hits 1947 1948 +1
- Misses 1285 1286 +1
Partials 71 71
```
| [Impacted Files](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [pkg/cache/node\_coordinator.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190/diff?src=pr&el=tree#diff-cGtnL2NhY2hlL25vZGVfY29vcmRpbmF0b3IuZ28=) | `65.62% <100.00%> (+1.10%)` | :arrow_up: |
| [pkg/shim/scheduler\_mock.go](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190/diff?src=pr&el=tree#diff-cGtnL3NoaW0vc2NoZWR1bGVyX21vY2suZ28=) | `85.00% <0.00%> (-0.84%)` | :arrow_down: |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190?src=pr&el=footer). Last update [d093b0d...cd65189](https://codecov.io/gh/apache/incubator-yunikorn-k8shim/pull/190?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-k8shim] codecov[bot] commented on pull request #190: [YUNIKORN-410] Pod state change may cause incorrect update on SchedulerNode#occupied
Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on pull request #190:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/190#issuecomment-690934990
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on a change in pull request #190: [YUNIKORN-410] Pod state change may cause incorrect update on SchedulerNode#occupied
Posted by GitBox <gi...@apache.org>.
yangwwei commented on a change in pull request #190:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/190#discussion_r487263705
##########
File path: pkg/cache/node_coordinator_test.go
##########
@@ -248,3 +248,45 @@ func TestDeleteTerminatedPod(t *testing.T) {
coordinator.deletePod(pod2)
assert.Equal(t, executed, false)
}
+
+func TestUpdatePodPendingRunningAlternatively(t *testing.T) {
+ mockedSchedulerAPI := newMockSchedulerAPI()
+ nodes := newSchedulerNodes(mockedSchedulerAPI, NewTestSchedulerCache())
+ host1 := utils.NodeForTest("HOST1", "10G", "10")
+ nodes.addNode(host1)
+ coordinator := newNodeResourceCoordinator(nodes)
+
+ // init pod (pod1) and changed pod (pod2)
+ pod1 := utils.PodForTest("pod1", "1G", "500m")
+ pod1.SetUID("UID-01")
+ pod2 := utils.PodForTest("pod1", "1G", "500m")
+ pod2.SetUID("UID-01")
+
+ // pod state changed from pending to running, trigger an update
+ pod1.Status.Phase = v1.PodPending
+ pod2.Status.Phase = v1.PodRunning
+ pod1.Spec.NodeName = "HOST1"
+ pod2.Spec.NodeName = "HOST1"
+ coordinator.updatePod(pod1, pod2)
+ node1 := coordinator.nodes.nodesMap["HOST1"]
+ assert.Equal(t, node1.occupied.Resources[constants.Memory].Value, int64(1000))
+ assert.Equal(t, node1.occupied.Resources[constants.CPU].Value, int64(500))
+
+ // pod state changed from running to pending,
Review comment:
Can pod state change from `Running` to `Pending`?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-k8shim] wilfred-s commented on pull request #190: [YUNIKORN-410] Pod state change may cause incorrect update on SchedulerNode#occupied
Posted by GitBox <gi...@apache.org>.
wilfred-s commented on pull request #190:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/190#issuecomment-694640099
I echo the question from @yangwwei on the fact that a pod status should not be going back to pending.
Based on the response that is part of an old [kubernetes issue](https://github.com/kubernetes/kubernetes/issues/10319), and the [pod documentation](https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/) a pod should never go back to a pending state.
The status is a one way change from pending -> running -> (succeeded || failed)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on pull request #190: [YUNIKORN-410] Pod state change may cause incorrect update on SchedulerNode#occupied
Posted by GitBox <gi...@apache.org>.
yangwwei commented on pull request #190:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/190#issuecomment-691287098
hi @TaoYang526
I am trying to understand how this could happen. In the code, we already did a check
```
oldPod.Status.Phase != newPod.Status.Phase
```
that means when we try to update the occupied resource, that means the pod is transiting from `Pending` to `Running`.
Pod phases are: `Pending`, `Running`, `Succeeded`, `Failed`, `Unknown`. Among them, `Succeed` and `Failed` are terminated state, which means they cannot transit to `Running` at all. So the only possibility is from `Pending` to `Running`, or `Unknown` to `Running`. The case you are seeing, is it from `Unknown` to `Running`?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-k8shim] codecov[bot] commented on pull request #190: [YUNIKORN-410] Pod state change may cause incorrect update on SchedulerNode#occupied
Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on pull request #190:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/190#issuecomment-690934990
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on a change in pull request #190: [YUNIKORN-410] Pod state change may cause incorrect update on SchedulerNode#occupied
Posted by GitBox <gi...@apache.org>.
yangwwei commented on a change in pull request #190:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/190#discussion_r487263705
##########
File path: pkg/cache/node_coordinator_test.go
##########
@@ -248,3 +248,45 @@ func TestDeleteTerminatedPod(t *testing.T) {
coordinator.deletePod(pod2)
assert.Equal(t, executed, false)
}
+
+func TestUpdatePodPendingRunningAlternatively(t *testing.T) {
+ mockedSchedulerAPI := newMockSchedulerAPI()
+ nodes := newSchedulerNodes(mockedSchedulerAPI, NewTestSchedulerCache())
+ host1 := utils.NodeForTest("HOST1", "10G", "10")
+ nodes.addNode(host1)
+ coordinator := newNodeResourceCoordinator(nodes)
+
+ // init pod (pod1) and changed pod (pod2)
+ pod1 := utils.PodForTest("pod1", "1G", "500m")
+ pod1.SetUID("UID-01")
+ pod2 := utils.PodForTest("pod1", "1G", "500m")
+ pod2.SetUID("UID-01")
+
+ // pod state changed from pending to running, trigger an update
+ pod1.Status.Phase = v1.PodPending
+ pod2.Status.Phase = v1.PodRunning
+ pod1.Spec.NodeName = "HOST1"
+ pod2.Spec.NodeName = "HOST1"
+ coordinator.updatePod(pod1, pod2)
+ node1 := coordinator.nodes.nodesMap["HOST1"]
+ assert.Equal(t, node1.occupied.Resources[constants.Memory].Value, int64(1000))
+ assert.Equal(t, node1.occupied.Resources[constants.CPU].Value, int64(500))
+
+ // pod state changed from running to pending,
Review comment:
Can pod state change from `Running` to `Pending`?
##########
File path: pkg/cache/node_coordinator_test.go
##########
@@ -248,3 +248,45 @@ func TestDeleteTerminatedPod(t *testing.T) {
coordinator.deletePod(pod2)
assert.Equal(t, executed, false)
}
+
+func TestUpdatePodPendingRunningAlternatively(t *testing.T) {
+ mockedSchedulerAPI := newMockSchedulerAPI()
+ nodes := newSchedulerNodes(mockedSchedulerAPI, NewTestSchedulerCache())
+ host1 := utils.NodeForTest("HOST1", "10G", "10")
+ nodes.addNode(host1)
+ coordinator := newNodeResourceCoordinator(nodes)
+
+ // init pod (pod1) and changed pod (pod2)
+ pod1 := utils.PodForTest("pod1", "1G", "500m")
+ pod1.SetUID("UID-01")
+ pod2 := utils.PodForTest("pod1", "1G", "500m")
+ pod2.SetUID("UID-01")
+
+ // pod state changed from pending to running, trigger an update
+ pod1.Status.Phase = v1.PodPending
+ pod2.Status.Phase = v1.PodRunning
+ pod1.Spec.NodeName = "HOST1"
+ pod2.Spec.NodeName = "HOST1"
+ coordinator.updatePod(pod1, pod2)
+ node1 := coordinator.nodes.nodesMap["HOST1"]
+ assert.Equal(t, node1.occupied.Resources[constants.Memory].Value, int64(1000))
+ assert.Equal(t, node1.occupied.Resources[constants.CPU].Value, int64(500))
+
+ // pod state changed from running to pending,
Review comment:
Can pod state change from `Running` to `Pending`?
##########
File path: pkg/cache/node_coordinator_test.go
##########
@@ -248,3 +248,45 @@ func TestDeleteTerminatedPod(t *testing.T) {
coordinator.deletePod(pod2)
assert.Equal(t, executed, false)
}
+
+func TestUpdatePodPendingRunningAlternatively(t *testing.T) {
+ mockedSchedulerAPI := newMockSchedulerAPI()
+ nodes := newSchedulerNodes(mockedSchedulerAPI, NewTestSchedulerCache())
+ host1 := utils.NodeForTest("HOST1", "10G", "10")
+ nodes.addNode(host1)
+ coordinator := newNodeResourceCoordinator(nodes)
+
+ // init pod (pod1) and changed pod (pod2)
+ pod1 := utils.PodForTest("pod1", "1G", "500m")
+ pod1.SetUID("UID-01")
+ pod2 := utils.PodForTest("pod1", "1G", "500m")
+ pod2.SetUID("UID-01")
+
+ // pod state changed from pending to running, trigger an update
+ pod1.Status.Phase = v1.PodPending
+ pod2.Status.Phase = v1.PodRunning
+ pod1.Spec.NodeName = "HOST1"
+ pod2.Spec.NodeName = "HOST1"
+ coordinator.updatePod(pod1, pod2)
+ node1 := coordinator.nodes.nodesMap["HOST1"]
+ assert.Equal(t, node1.occupied.Resources[constants.Memory].Value, int64(1000))
+ assert.Equal(t, node1.occupied.Resources[constants.CPU].Value, int64(500))
+
+ // pod state changed from running to pending,
Review comment:
Can pod state change from `Running` to `Pending`?
##########
File path: pkg/cache/node_coordinator_test.go
##########
@@ -248,3 +248,45 @@ func TestDeleteTerminatedPod(t *testing.T) {
coordinator.deletePod(pod2)
assert.Equal(t, executed, false)
}
+
+func TestUpdatePodPendingRunningAlternatively(t *testing.T) {
+ mockedSchedulerAPI := newMockSchedulerAPI()
+ nodes := newSchedulerNodes(mockedSchedulerAPI, NewTestSchedulerCache())
+ host1 := utils.NodeForTest("HOST1", "10G", "10")
+ nodes.addNode(host1)
+ coordinator := newNodeResourceCoordinator(nodes)
+
+ // init pod (pod1) and changed pod (pod2)
+ pod1 := utils.PodForTest("pod1", "1G", "500m")
+ pod1.SetUID("UID-01")
+ pod2 := utils.PodForTest("pod1", "1G", "500m")
+ pod2.SetUID("UID-01")
+
+ // pod state changed from pending to running, trigger an update
+ pod1.Status.Phase = v1.PodPending
+ pod2.Status.Phase = v1.PodRunning
+ pod1.Spec.NodeName = "HOST1"
+ pod2.Spec.NodeName = "HOST1"
+ coordinator.updatePod(pod1, pod2)
+ node1 := coordinator.nodes.nodesMap["HOST1"]
+ assert.Equal(t, node1.occupied.Resources[constants.Memory].Value, int64(1000))
+ assert.Equal(t, node1.occupied.Resources[constants.CPU].Value, int64(500))
+
+ // pod state changed from running to pending,
Review comment:
Can pod state change from `Running` to `Pending`?
##########
File path: pkg/cache/node_coordinator_test.go
##########
@@ -248,3 +248,45 @@ func TestDeleteTerminatedPod(t *testing.T) {
coordinator.deletePod(pod2)
assert.Equal(t, executed, false)
}
+
+func TestUpdatePodPendingRunningAlternatively(t *testing.T) {
+ mockedSchedulerAPI := newMockSchedulerAPI()
+ nodes := newSchedulerNodes(mockedSchedulerAPI, NewTestSchedulerCache())
+ host1 := utils.NodeForTest("HOST1", "10G", "10")
+ nodes.addNode(host1)
+ coordinator := newNodeResourceCoordinator(nodes)
+
+ // init pod (pod1) and changed pod (pod2)
+ pod1 := utils.PodForTest("pod1", "1G", "500m")
+ pod1.SetUID("UID-01")
+ pod2 := utils.PodForTest("pod1", "1G", "500m")
+ pod2.SetUID("UID-01")
+
+ // pod state changed from pending to running, trigger an update
+ pod1.Status.Phase = v1.PodPending
+ pod2.Status.Phase = v1.PodRunning
+ pod1.Spec.NodeName = "HOST1"
+ pod2.Spec.NodeName = "HOST1"
+ coordinator.updatePod(pod1, pod2)
+ node1 := coordinator.nodes.nodesMap["HOST1"]
+ assert.Equal(t, node1.occupied.Resources[constants.Memory].Value, int64(1000))
+ assert.Equal(t, node1.occupied.Resources[constants.CPU].Value, int64(500))
+
+ // pod state changed from running to pending,
Review comment:
Can pod state change from `Running` to `Pending`?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on pull request #190: [YUNIKORN-410] Pod state change may cause incorrect update on SchedulerNode#occupied
Posted by GitBox <gi...@apache.org>.
yangwwei commented on pull request #190:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/190#issuecomment-691287098
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-k8shim] yangwwei closed pull request #190: [YUNIKORN-410] Pod state change may cause incorrect update on SchedulerNode#occupied
Posted by GitBox <gi...@apache.org>.
yangwwei closed pull request #190:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/190
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on pull request #190: [YUNIKORN-410] Pod state change may cause incorrect update on SchedulerNode#occupied
Posted by GitBox <gi...@apache.org>.
yangwwei commented on pull request #190:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/190#issuecomment-691287098
hi @TaoYang526
I am trying to understand how this could happen. In the code, we already did a check
```
oldPod.Status.Phase != newPod.Status.Phase
```
that means when we try to update the occupied resource, that means the pod is transiting from `Pending` to `Running`.
Pod phases are: `Pending`, `Running`, `Succeeded`, `Failed`, `Unknown`. Among them, `Succeed` and `Failed` are terminated state, which means they cannot transit to `Running` at all. So the only possibility is from `Pending` to `Running`, or `Unknown` to `Running`. The case you are seeing, is it from `Unknown` to `Running`?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-k8shim] yangwwei commented on a change in pull request #190: [YUNIKORN-410] Pod state change may cause incorrect update on SchedulerNode#occupied
Posted by GitBox <gi...@apache.org>.
yangwwei commented on a change in pull request #190:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/190#discussion_r487263705
##########
File path: pkg/cache/node_coordinator_test.go
##########
@@ -248,3 +248,45 @@ func TestDeleteTerminatedPod(t *testing.T) {
coordinator.deletePod(pod2)
assert.Equal(t, executed, false)
}
+
+func TestUpdatePodPendingRunningAlternatively(t *testing.T) {
+ mockedSchedulerAPI := newMockSchedulerAPI()
+ nodes := newSchedulerNodes(mockedSchedulerAPI, NewTestSchedulerCache())
+ host1 := utils.NodeForTest("HOST1", "10G", "10")
+ nodes.addNode(host1)
+ coordinator := newNodeResourceCoordinator(nodes)
+
+ // init pod (pod1) and changed pod (pod2)
+ pod1 := utils.PodForTest("pod1", "1G", "500m")
+ pod1.SetUID("UID-01")
+ pod2 := utils.PodForTest("pod1", "1G", "500m")
+ pod2.SetUID("UID-01")
+
+ // pod state changed from pending to running, trigger an update
+ pod1.Status.Phase = v1.PodPending
+ pod2.Status.Phase = v1.PodRunning
+ pod1.Spec.NodeName = "HOST1"
+ pod2.Spec.NodeName = "HOST1"
+ coordinator.updatePod(pod1, pod2)
+ node1 := coordinator.nodes.nodesMap["HOST1"]
+ assert.Equal(t, node1.occupied.Resources[constants.Memory].Value, int64(1000))
+ assert.Equal(t, node1.occupied.Resources[constants.CPU].Value, int64(500))
+
+ // pod state changed from running to pending,
Review comment:
Can pod state change from `Running` to `Pending`?
##########
File path: pkg/cache/node_coordinator_test.go
##########
@@ -248,3 +248,45 @@ func TestDeleteTerminatedPod(t *testing.T) {
coordinator.deletePod(pod2)
assert.Equal(t, executed, false)
}
+
+func TestUpdatePodPendingRunningAlternatively(t *testing.T) {
+ mockedSchedulerAPI := newMockSchedulerAPI()
+ nodes := newSchedulerNodes(mockedSchedulerAPI, NewTestSchedulerCache())
+ host1 := utils.NodeForTest("HOST1", "10G", "10")
+ nodes.addNode(host1)
+ coordinator := newNodeResourceCoordinator(nodes)
+
+ // init pod (pod1) and changed pod (pod2)
+ pod1 := utils.PodForTest("pod1", "1G", "500m")
+ pod1.SetUID("UID-01")
+ pod2 := utils.PodForTest("pod1", "1G", "500m")
+ pod2.SetUID("UID-01")
+
+ // pod state changed from pending to running, trigger an update
+ pod1.Status.Phase = v1.PodPending
+ pod2.Status.Phase = v1.PodRunning
+ pod1.Spec.NodeName = "HOST1"
+ pod2.Spec.NodeName = "HOST1"
+ coordinator.updatePod(pod1, pod2)
+ node1 := coordinator.nodes.nodesMap["HOST1"]
+ assert.Equal(t, node1.occupied.Resources[constants.Memory].Value, int64(1000))
+ assert.Equal(t, node1.occupied.Resources[constants.CPU].Value, int64(500))
+
+ // pod state changed from running to pending,
Review comment:
Can pod state change from `Running` to `Pending`?
##########
File path: pkg/cache/node_coordinator_test.go
##########
@@ -248,3 +248,45 @@ func TestDeleteTerminatedPod(t *testing.T) {
coordinator.deletePod(pod2)
assert.Equal(t, executed, false)
}
+
+func TestUpdatePodPendingRunningAlternatively(t *testing.T) {
+ mockedSchedulerAPI := newMockSchedulerAPI()
+ nodes := newSchedulerNodes(mockedSchedulerAPI, NewTestSchedulerCache())
+ host1 := utils.NodeForTest("HOST1", "10G", "10")
+ nodes.addNode(host1)
+ coordinator := newNodeResourceCoordinator(nodes)
+
+ // init pod (pod1) and changed pod (pod2)
+ pod1 := utils.PodForTest("pod1", "1G", "500m")
+ pod1.SetUID("UID-01")
+ pod2 := utils.PodForTest("pod1", "1G", "500m")
+ pod2.SetUID("UID-01")
+
+ // pod state changed from pending to running, trigger an update
+ pod1.Status.Phase = v1.PodPending
+ pod2.Status.Phase = v1.PodRunning
+ pod1.Spec.NodeName = "HOST1"
+ pod2.Spec.NodeName = "HOST1"
+ coordinator.updatePod(pod1, pod2)
+ node1 := coordinator.nodes.nodesMap["HOST1"]
+ assert.Equal(t, node1.occupied.Resources[constants.Memory].Value, int64(1000))
+ assert.Equal(t, node1.occupied.Resources[constants.CPU].Value, int64(500))
+
+ // pod state changed from running to pending,
Review comment:
Can pod state change from `Running` to `Pending`?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-yunikorn-k8shim] codecov[bot] edited a comment on pull request #190: [YUNIKORN-410] Pod state change may cause incorrect update on SchedulerNode#occupied
Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #190:
URL: https://github.com/apache/incubator-yunikorn-k8shim/pull/190#issuecomment-690934990
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org