Posted to commits@spark.apache.org by ho...@apache.org on 2020/12/11 22:45:23 UTC
[spark] branch master updated: [SPARK-33716][K8S] Fix potential race condition during pod termination
This is an automated email from the ASF dual-hosted git repository.
holden pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 15486fa [SPARK-33716][K8S] Fix potential race condition during pod termination
15486fa is described below
commit 15486fa970aa104e285cae0379a110f3795f3eaa
Author: Holden Karau <hk...@apple.com>
AuthorDate: Fri Dec 11 14:43:57 2020 -0800
[SPARK-33716][K8S] Fix potential race condition during pod termination
### What changes were proposed in this pull request?
Check that the pod state is not pending or running even if there is a deletion timestamp.
### Why are the changes needed?
The race can occur when etcd does not update the pod state and the deletion timestamp in sync, and we take a pod snapshot while the view is inconsistent.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Manual testing with a local version of Minikube on an overloaded computer, which caused out-of-sync updates.
Closes #30693 from holdenk/SPARK-33716-decommissioning-race-condition-during-pod-snapshot.
Authored-by: Holden Karau <hk...@apple.com>
Signed-off-by: Holden Karau <hk...@apple.com>
---
.../org/apache/spark/scheduler/cluster/k8s/ExecutorPodsSnapshot.scala | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsSnapshot.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsSnapshot.scala
index be75311..e81d213 100644
--- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsSnapshot.scala
+++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsSnapshot.scala
@@ -93,7 +93,8 @@ object ExecutorPodsSnapshot extends Logging {
(
pod.getStatus == null ||
pod.getStatus.getPhase == null ||
- pod.getStatus.getPhase.toLowerCase(Locale.ROOT) != "terminating"
+ (pod.getStatus.getPhase.toLowerCase(Locale.ROOT) != "terminating" &&
+ pod.getStatus.getPhase.toLowerCase(Locale.ROOT) != "running")
))
}
}
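
To make the effect of the changed condition easier to see, here is a minimal, self-contained sketch of the predicate in isolation. It is not the code from ExecutorPodsSnapshot.scala: `PodView` and `PodDeletionCheck` are hypothetical stand-ins for the fabric8 `Pod` fields that the hunk reads, and the outer "has a deletion timestamp" condition is an assumption taken from the commit message, since the hunk does not show it.

```scala
import java.util.Locale

// Hypothetical stand-in for the Pod fields the check reads (not the fabric8 Pod class).
// None models a null deletion timestamp or a null status/phase.
final case class PodView(deletionTimestamp: Option[String], phase: Option[String])

object PodDeletionCheck {
  // Phases for which, per the diff, a pod is not yet treated as deleted.
  private val notYetDeletedPhases = Set("terminating", "running")

  // Assumption: the pod only counts as deleted when a deletion timestamp is set.
  // A missing phase still counts as deleted, mirroring the null checks in the hunk.
  def isDeleted(pod: PodView): Boolean =
    pod.deletionTimestamp.isDefined &&
      pod.phase.forall(p => !notYetDeletedPhases.contains(p.toLowerCase(Locale.ROOT)))
}
```

With this sketch, a pod that has a deletion timestamp but still reports the phase "Running" is not treated as deleted, which is the inconsistent snapshot the commit message describes. A pod with a deletion timestamp and a phase such as "Failed", or with no phase at all, is still treated as deleted.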