Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/10/01 17:29:13 UTC

[GitHub] [spark] huskysun opened a new pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

huskysun opened a new pull request #29924:
URL: https://github.com/apache/spark/pull/29924


   Handle executor failure with multiple containers
   
   Added a Spark property, spark.kubernetes.executor.checkAllContainers,
   which defaults to false. When it is true and the pod restart policy is
   "Never", the executor snapshot takes all containers in the executor pod
   into consideration when deciding whether the executor is in the
   "Running" state. The new property is also documented.
   
   ### What changes were proposed in this pull request?
   When the `spark.kubernetes.executor.checkAllContainers` property is set to true, all containers in the executor pod are checked when reporting executor status.
   
   ### Why are the changes needed?
   Currently, a pod remains "running" as long as at least one of its containers is running. This prevents Spark from noticing when a container has failed in an executor pod with multiple containers. With this change, users can opt in to a different behavior: if any container in the executor pod has failed, whether the executor process or one of its sidecars, the pod is considered failed and will be rescheduled.
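
The decision described above can be illustrated with a minimal, self-contained sketch. It does not use the fabric8 Kubernetes client classes that the actual diff relies on; `ContainerStatus`, `Pod`, `ExecutorState`, and `classify` are hypothetical stand-ins introduced only for illustration, and the handling of unrecognized phases is an assumption.

```scala
// Hypothetical, simplified model of the pod-state decision discussed in this PR.
// None of these types are Spark or fabric8 classes; they exist only for this sketch.
object ExecutorStateSketch {
  // exitCode is None while the container has not terminated.
  final case class ContainerStatus(name: String, exitCode: Option[Int])
  final case class Pod(phase: String, restartPolicy: String, containers: Seq[ContainerStatus])

  sealed trait ExecutorState
  case object Pending extends ExecutorState
  case object Running extends ExecutorState
  case object Failed extends ExecutorState
  case object Succeeded extends ExecutorState
  case object Unknown extends ExecutorState // assumption: fallback for other phases

  def classify(pod: Pod, checkAllContainers: Boolean): ExecutorState =
    pod.phase.toLowerCase match {
      case "pending" => Pending
      case "running" =>
        // A container counts as failed if it terminated with a non-zero exit code.
        val anyContainerFailed = pod.containers.exists(_.exitCode.exists(_ != 0))
        // Only treat the pod as failed when the feature is enabled and the
        // restart policy is "Never", since the container will not be restarted.
        if (checkAllContainers && pod.restartPolicy == "Never" && anyContainerFailed) Failed
        else Running
      case "failed" => Failed
      case "succeeded" => Succeeded
      case _ => Unknown
    }
}
```

In this sketch, a pod whose phase is "Running" but whose sidecar exited with code 1 is classified as `Failed` only when `checkAllContainers` is true and the restart policy is "Never"; otherwise it stays `Running`, which mirrors the default behavior the description calls out.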
   
   ### Does this PR introduce _any_ user-facing change?
   Yes, a new Spark property was added.
   Users can now choose whether to enable this feature with the `spark.kubernetes.executor.checkAllContainers` property.
   
   ### How was this patch tested?
   A unit test was added, and all tests passed.
   I tried to run the integration tests by following the instructions [here](https://spark.apache.org/developer-tools.html) (section "Testing K8S") and [here](https://github.com/apache/spark/blob/master/resource-managers/kubernetes/integration-tests/README.md), but I couldn't get them to run because they fail to talk to the minikube cluster. Maybe my minikube version is too new (I'm using v1.13.1)? After two days of trying without success, I decided to submit this PR in the hope that the Jenkins tests will pass.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] huskysun commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
huskysun commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-702288016


   This PR is a continuation of the effort for https://github.com/apache/spark/pull/27568.
   @holdenk Please take a look, as you said you're interested in merging this. (Sorry for taking this long to submit the PR :sweat_smile: )
   cc @gongx 




[GitHub] [spark] SparkQA commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-715578994


   **[Test build #130215 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130215/testReport)** for PR 29924 at commit [`a4958b9`](https://github.com/apache/spark/commit/a4958b9b99066e2c23c8cb5ba54b2cb5a9ae9a4d).




[GitHub] [spark] huskysun commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
huskysun commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-707959738


   Also, has Jenkins been triggered? It looks like the bot hasn't commented on this PR.




[GitHub] [spark] huskysun commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
huskysun commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-715493273


   Hi @holdenk, sorry to keep bugging you. Any updates on this? Is there any further work needed behind the scenes (like release cadence or something) that prevents this from being merged? Thanks!




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-702286036


   Can one of the admins verify this patch?




[GitHub] [spark] SparkQA commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-708661853


   **[Test build #129762 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129762/testReport)** for PR 29924 at commit [`a4958b9`](https://github.com/apache/spark/commit/a4958b9b99066e2c23c8cb5ba54b2cb5a9ae9a4d).




[GitHub] [spark] SparkQA removed a comment on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-715578994


   **[Test build #130215 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130215/testReport)** for PR 29924 at commit [`a4958b9`](https://github.com/apache/spark/commit/a4958b9b99066e2c23c8cb5ba54b2cb5a9ae9a4d).




[GitHub] [spark] holdenk commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
holdenk commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-708581852


   huh weird. Jenkins test this please.




[GitHub] [spark] AmplabJenkins commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-702286766


   Can one of the admins verify this patch?




[GitHub] [spark] AmplabJenkins commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-702286036


   Can one of the admins verify this patch?




[GitHub] [spark] holdenk commented on a change in pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
holdenk commented on a change in pull request #29924:
URL: https://github.com/apache/spark/pull/29924#discussion_r504163939



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsSnapshot.scala
##########
@@ -59,11 +65,19 @@ object ExecutorPodsSnapshot extends Logging {
         case "pending" =>
           PodPending(pod)
         case "running" =>
-          PodRunning(pod)
+          if (shouldCheckAllContainers &&
+            "Never" == pod.getSpec.getRestartPolicy &&
+            pod.getStatus.getContainerStatuses.stream
+              .map[ContainerStateTerminated](cs => cs.getState.getTerminated)
+              .anyMatch(t => t != null && t.getExitCode != 0)) {
+            PodFailed(pod)
+          } else {
+            PodRunning(pod)
+          }
         case "failed" =>
           PodFailed(pod)
         case "succeeded" =>
-          PodSucceeded(pod)
+            PodSucceeded(pod)

Review comment:
       nit: indentation should match.

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsSnapshot.scala
##########
@@ -59,11 +65,19 @@ object ExecutorPodsSnapshot extends Logging {
         case "pending" =>
           PodPending(pod)
         case "running" =>
-          PodRunning(pod)
+          if (shouldCheckAllContainers &&
+            "Never" == pod.getSpec.getRestartPolicy &&

Review comment:
       Good addition :+1: 






[GitHub] [spark] huskysun commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
huskysun commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-707955643


   @holdenk I fixed the indentation. Please take another look. Thanks!




[GitHub] [spark] gongx commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
gongx commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-707919371


   @holdenk Could you review this PR, please?




[GitHub] [spark] AmplabJenkins commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-715605602








[GitHub] [spark] AmplabJenkins commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-708718354








[GitHub] [spark] AmplabJenkins commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-715583662








[GitHub] [spark] gongx commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
gongx commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-708589011


   @holdenk Thank you for the help. I do see that the Jenkins status is "Asked to test", but it is not making any progress.




[GitHub] [spark] huskysun commented on a change in pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
huskysun commented on a change in pull request #29924:
URL: https://github.com/apache/spark/pull/29924#discussion_r504198638



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsSnapshot.scala
##########
@@ -59,11 +65,19 @@ object ExecutorPodsSnapshot extends Logging {
         case "pending" =>
           PodPending(pod)
         case "running" =>
-          PodRunning(pod)
+          if (shouldCheckAllContainers &&
+            "Never" == pod.getSpec.getRestartPolicy &&
+            pod.getStatus.getContainerStatuses.stream
+              .map[ContainerStateTerminated](cs => cs.getState.getTerminated)
+              .anyMatch(t => t != null && t.getExitCode != 0)) {
+            PodFailed(pod)
+          } else {
+            PodRunning(pod)
+          }
         case "failed" =>
           PodFailed(pod)
         case "succeeded" =>
-          PodSucceeded(pod)
+            PodSucceeded(pod)

Review comment:
       Sorry about that, just fixed it






[GitHub] [spark] asfgit closed pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #29924:
URL: https://github.com/apache/spark/pull/29924


   




[GitHub] [spark] SparkQA commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-715595643


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34815/
   




[GitHub] [spark] gongx commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
gongx commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-710649168


   @holdenk Thank you for the review. Please merge it if you get time.




[GitHub] [spark] dongjoon-hyun commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-715577602


   Retest this please.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-715583662








[GitHub] [spark] holdenk commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
holdenk commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-707990107


   Jenkins OK to test




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-702286766


   Can one of the admins verify this patch?




[GitHub] [spark] SparkQA commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-708713241


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34368/
   




[GitHub] [spark] SparkQA commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-715583511


   **[Test build #130215 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130215/testReport)** for PR 29924 at commit [`a4958b9`](https://github.com/apache/spark/commit/a4958b9b99066e2c23c8cb5ba54b2cb5a9ae9a4d).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-708718332


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34368/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-715605602








[GitHub] [spark] gongx commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
gongx commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-713013865


   @holdenk Please help us to merge this change if you have time. Thank you.




[GitHub] [spark] SparkQA removed a comment on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-708661853


   **[Test build #129762 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129762/testReport)** for PR 29924 at commit [`a4958b9`](https://github.com/apache/spark/commit/a4958b9b99066e2c23c8cb5ba54b2cb5a9ae9a4d).




[GitHub] [spark] holdenk commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
holdenk commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-708584220


   cc @SparkQA 
   Jenkins OK to test




[GitHub] [spark] SparkQA commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-715605582


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34815/
   




[GitHub] [spark] holdenk commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
holdenk commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-715997264


   @huskysun what is your JIRA account handle?




[GitHub] [spark] SparkQA commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-708670609


   **[Test build #129762 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129762/testReport)** for PR 29924 at commit [`a4958b9`](https://github.com/apache/spark/commit/a4958b9b99066e2c23c8cb5ba54b2cb5a9ae9a4d).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] huskysun commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
huskysun commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-709490243


   Thank you @holdenk!
   Also thanks @khogeland for starting on this effort and @gongx for the help on coordination.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-708670851








[GitHub] [spark] huskysun commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
huskysun commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-716084244


   Thanks for merging it in! I just created a JIRA account, and my username is
   "huskysun" and full name is "Shiqi Sun". Let me know if these work, thanks.
   
   On Sat, Oct 24, 2020 at 10:01 AM Holden Karau <no...@github.com>
   wrote:
   
   > @huskysun <https://github.com/huskysun> what is your JIRA account handle?
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-708718354








[GitHub] [spark] AmplabJenkins commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-708670851








[GitHub] [spark] holdenk commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
holdenk commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-715994601


   Merged and backported as a fix to branch-3.




[GitHub] [spark] holdenk commented on pull request #29924: [SPARK-30821][K8S]Handle executor failure with multiple containers

Posted by GitBox <gi...@apache.org>.
holdenk commented on pull request #29924:
URL: https://github.com/apache/spark/pull/29924#issuecomment-708587061


   Filed a ticket to @shaneknapp ( https://issues.apache.org/jira/browse/SPARK-33151 ) since it seems like Jenkins might just be stuck.

