Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/12/08 05:51:22 UTC

[GitHub] [spark] dongjoon-hyun opened a new pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

dongjoon-hyun opened a new pull request #34832:
URL: https://github.com/apache/spark/pull/34832


   ### What changes were proposed in this pull request?
   
   This PR adds a built-in plugin that rolls K8s executors by periodically decommissioning them. It can be enabled as follows.
   
   ```
   spark-3.3.0-SNAPSHOT-bin-3.3.1/bin/spark-submit \
   --master k8s://https://kubernetes.docker.internal:6443 \
   --deploy-mode cluster \
   -c spark.decommission.enabled=true \
   -c spark.plugins=org.apache.spark.scheduler.cluster.k8s.ExecutorRollPlugin \
   -c spark.kubernetes.executor.rollInterval=60 \
   -c spark.executor.instances=2 \
   -c spark.kubernetes.container.image=spark:latest \
   --class org.apache.spark.examples.SparkPi \
   local:///opt/spark/examples/jars/spark-examples_2.12-3.3.0-SNAPSHOT.jar 200000
   ```
   
   ### Why are the changes needed?
   
   This built-in plugin is useful when we want to replace long-lived executors with new ones.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No. This is a new feature.
   
   ### How was this patch tested?
   
   Pass the K8s integration tests.
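
The selection heuristic in the diff quoted later in this thread (take the smallest numeric executor ID and treat it as the longest-lived executor) can be sketched in isolation. This is a minimal sketch, not the plugin itself: `RollTargetSketch`, `pickRollTarget`, and `DriverId` are hypothetical names, and the executor IDs are passed in as plain strings rather than read from a scheduler backend.

```scala
object RollTargetSketch {
  // Stand-in for SparkContext.DRIVER_IDENTIFIER, whose value is "driver".
  val DriverId = "driver"

  /** Returns the smallest numeric executor ID, or None when only the driver is listed. */
  def pickRollTarget(executorIds: Seq[String]): Option[Int] =
    executorIds
      .filterNot(_ == DriverId) // the driver is never a roll candidate
      .map(_.toInt)             // compare numerically, so "10" sorts after "2"
      .sorted
      .headOption
}
```

Calling `RollTargetSketch.pickRollTarget(Seq("driver", "3", "10", "2"))` yields `Some(2)`. The conversion to `Int` matters here: sorted as strings, "10" would come before "2".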


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988589694


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146002/
   




[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988635794


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50478/
   




[GitHub] [spark] dongjoon-hyun commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988905344


   Thank you, @HyukjinKwon. I addressed your comments.




[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988532885


   **[Test build #145996 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145996/testReport)** for PR 34832 at commit [`4bc0851`](https://github.com/apache/spark/commit/4bc0851240edbccf7848d561d2a67f3275516021).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
     * `class ExecutorRollPlugin extends SparkPlugin with Logging `




[GitHub] [spark] HyukjinKwon commented on a change in pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34832:
URL: https://github.com/apache/spark/pull/34832#discussion_r764721127



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorRollPlugin.scala
##########
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.scheduler.cluster.k8s
+
+import java.util.{Map => JMap}
+import java.util.concurrent.{ScheduledExecutorService, TimeUnit}
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.SparkContext
+import org.apache.spark.api.plugin.{DriverPlugin, ExecutorPlugin, PluginContext, SparkPlugin}
+import org.apache.spark.deploy.k8s.Config.EXECUTOR_ROLL_INTERVAL
+import org.apache.spark.internal.Logging
+import org.apache.spark.internal.config.DECOMMISSION_ENABLED
+import org.apache.spark.scheduler.ExecutorDecommissionInfo
+import org.apache.spark.util.ThreadUtils
+
+/**
+ * Spark plugin to roll executor pods periodically.
+ * This is independent from ExecutorPodsAllocator and aims to decommission executors
+ * one by one in both static and dynamic allocation.
+ *
+ * To use this plugin, we assume that a user has the required maximum number of executors + 1
+ * in both static and dynamic allocation configurations.
+ */
+class ExecutorRollPlugin extends SparkPlugin with Logging {
+  override def driverPlugin(): DriverPlugin = {
+    new DriverPlugin() {
+      private var sparkContext: SparkContext = _
+
+      private val periodicService: ScheduledExecutorService =
+        ThreadUtils.newDaemonSingleThreadScheduledExecutor("executor-roller")
+
+      override def init(sc: SparkContext, ctx: PluginContext): JMap[String, String] = {
+        val interval = sc.conf.get(EXECUTOR_ROLL_INTERVAL)
+        if (interval <= 0) {
+          logWarning(s"Disabled due to invalid interval value, '$interval'")
+        } else if (!sc.conf.get(DECOMMISSION_ENABLED)) {
+          logWarning("Disabled because ${DECOMMISSION_ENABLED} is false.")
+        } else {
+          // Scheduler is not created yet
+          sparkContext = sc
+
+          periodicService.scheduleAtFixedRate(() => {
+            try {
+              sparkContext.schedulerBackend match {
+                case scheduler: KubernetesClusterSchedulerBackend =>
+                  // Roughly assume that the smallest ID executor is the most long-lived one.
+                  val smallestID = scheduler
+                    .getExecutorIds()
+                    .filterNot(_.equals(SparkContext.DRIVER_IDENTIFIER))
+                    .map(_.toInt)
+                    .sorted
+                    .headOption
+                  smallestID match {
+                    case Some(id) =>
+                      // Use decommission to be safe.
+                      logInfo(s"Ask to decommission executor $id")
+                      val now = System.currentTimeMillis()
+                      scheduler.decommissionExecutor(
+                        id.toString,
+                        ExecutorDecommissionInfo(s"Rolling at $now"),
+                        adjustTargetNumExecutors = false)
+                    case _ =>
+                      logInfo("There is nothing to roll.")
+                  }
+                case _ =>
+                  logWarning("This plugin expects KubernetesClusterSchedulerBackend.")
+              }
+            } catch {
+              case e: Exception => logError("Error in rolling thread", e)

Review comment:
       ```suggestion
                 case e: Throwable => logError("Error in rolling thread", e)
   ```
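
The review comment does not state its reasoning, but one documented behavior is relevant: `ScheduledExecutorService.scheduleAtFixedRate` suppresses all subsequent executions once any execution of the task throws. The sketch below, with hypothetical names (`SuppressedTaskSketch`, `runsWithErrorOnFirstTick`) and a plain JDK executor in place of Spark's `ThreadUtils`, throws a `java.lang.Error` on the first tick and counts how many ticks run under each catch clause.

```scala
import java.util.concurrent.{Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object SuppressedTaskSketch {
  /** Runs a periodic task that throws an Error on its first tick; returns the tick count. */
  def runsWithErrorOnFirstTick(catchThrowable: Boolean): Int = {
    val runs = new AtomicInteger(0)
    val service = Executors.newSingleThreadScheduledExecutor()
    val task: Runnable = () => {
      val n = runs.incrementAndGet()
      try {
        // java.lang.Error is a Throwable but not an Exception.
        if (n == 1) throw new Error("simulated non-Exception failure")
      } catch {
        case _: Exception => ()                   // what the original line catches
        case _: Throwable if catchThrowable => () // what the suggestion also catches
      }
    }
    service.scheduleAtFixedRate(task, 0L, 20L, TimeUnit.MILLISECONDS)
    Thread.sleep(300)
    service.shutdownNow()
    runs.get()
  }
}
```

With `catchThrowable = false` the error escapes the task and the counter stays at 1; with `catchThrowable = true` the task keeps being rescheduled for the whole sleep window.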






[GitHub] [spark] AmplabJenkins removed a comment on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988581718


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50474/
   




[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988694973


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50484/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988589694








[GitHub] [spark] dongjoon-hyun commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988907124


   ```
   KubernetesSuite:
   - Run SparkPi with no resources
   - Run SparkPi with no resources & statefulset allocation
   - Run SparkPi with a very long application name.
   - Use SparkLauncher.NO_RESOURCE
   - Run SparkPi with a master URL without a scheme.
   - Run SparkPi with an argument.
   - Run SparkPi with custom labels, annotations, and environment variables.
   - All pods have the same service account by default
   - Run extraJVMOptions check on driver
   - Run SparkRemoteFileTest using a remote data file
   - Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties
   - Run SparkPi with env and mount secrets.
   - Run PySpark on simple pi.py example
   - Run PySpark to test a pyfiles example
   - Run PySpark with memory customization
   - Run in client mode.
   - Start pod creation from template
   - PVs with local hostpath storage on statefulsets
   - PVs with local hostpath and storageClass on statefulsets
   - PVs with local storage
   - Launcher client dependencies
   - SPARK-33615: Launcher client archives
   - SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
   - SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
   - Launcher python client dependencies using a zip file
   - Test basic decommissioning
   - Test basic decommissioning with shuffle cleanup
   - Test decommissioning with dynamic allocation & shuffle cleanups *** FAILED ***
   ```




[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988570616


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50472/
   




[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988560504


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50470/
   




[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988529101


   **[Test build #145994 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145994/testReport)** for PR 34832 at commit [`e3f06e1`](https://github.com/apache/spark/commit/e3f06e12da7207dfb6aa6a6560c7d07241370cd3).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
     * `class ExecutorRollPlugin extends SparkPlugin with Logging `




[GitHub] [spark] SparkQA removed a comment on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988522146


   **[Test build #145994 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145994/testReport)** for PR 34832 at commit [`e3f06e1`](https://github.com/apache/spark/commit/e3f06e12da7207dfb6aa6a6560c7d07241370cd3).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988577628








[GitHub] [spark] dongjoon-hyun edited a comment on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun edited a comment on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988637195


   The Jenkins K8s integration test run finished, but the result looks a little odd because it doesn't include the new test method.
   ```
   KubernetesSuite:
   - Run SparkPi with no resources
   - Run SparkPi with no resources & statefulset allocation
   - Run SparkPi with a very long application name.
   - Use SparkLauncher.NO_RESOURCE
   - Run SparkPi with a master URL without a scheme.
   - Run SparkPi with an argument.
   - Run SparkPi with custom labels, annotations, and environment variables.
   - All pods have the same service account by default
   - Run extraJVMOptions check on driver
   - Run SparkRemoteFileTest using a remote data file
   - Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties
   - Run SparkPi with env and mount secrets.
   - Run PySpark on simple pi.py example
   - Run PySpark to test a pyfiles example
   - Run PySpark with memory customization
   - Run in client mode.
   - Start pod creation from template
   - PVs with local hostpath storage on statefulsets
   - PVs with local hostpath and storageClass on statefulsets
   - PVs with local storage
   - Launcher client dependencies
   - SPARK-33615: Launcher client archives
   - SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
   - SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
   - Launcher python client dependencies using a zip file
   - Test basic decommissioning
   - Test basic decommissioning with shuffle cleanup
   - Test decommissioning with dynamic allocation & shuffle cleanups *** FAILED ***
   - Test decommissioning timeouts
   - Run SparkR on simple dataframe.R example *** FAILED ***
   Run completed in 31 minutes, 58 seconds.
   Total number of tests run: 30
   Suites: completed 2, aborted 0
   Tests: succeeded 28, failed 2, canceled 0, ignored 0, pending 0
   *** 2 TESTS FAILED ***
   ```




[GitHub] [spark] HyukjinKwon commented on a change in pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34832:
URL: https://github.com/apache/spark/pull/34832#discussion_r764714660



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorRollPlugin.scala
##########
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.scheduler.cluster.k8s
+
+import java.util.{Map => JMap}
+import java.util.concurrent.{ScheduledExecutorService, TimeUnit}
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.SparkContext
+import org.apache.spark.api.plugin.{DriverPlugin, ExecutorPlugin, PluginContext, SparkPlugin}
+import org.apache.spark.deploy.k8s.Config.EXECUTOR_ROLL_INTERVAL
+import org.apache.spark.internal.Logging
+import org.apache.spark.internal.config.DECOMMISSION_ENABLED
+import org.apache.spark.scheduler.ExecutorDecommissionInfo
+import org.apache.spark.util.ThreadUtils
+
+/**
+ * Spark plugin to roll executor pods periodically.
+ * This is independent from ExecutorPodsAllocator and aims to decommission executors
+ * one by one in both static and dynamic allocation.
+ *
+ * To use this plugin, we assume that a user has the required maximum number of executors + 1
+ * in both static and dynamic allocation configurations.
+ */
+class ExecutorRollPlugin extends SparkPlugin with Logging {
+  override def driverPlugin(): DriverPlugin = {
+    new DriverPlugin() {
+      private var sparkContext: SparkContext = _
+
+      private val periodicService: ScheduledExecutorService =
+        ThreadUtils.newDaemonSingleThreadScheduledExecutor("executor-roller")
+
+      override def init(sc: SparkContext, ctx: PluginContext): JMap[String, String] = {
+        val interval = sc.conf.get(EXECUTOR_ROLL_INTERVAL)
+        if (interval <= 0) {
+          logWarning(s"Disabled due to invalid interval value, '$interval'")
+        } else if (!sc.conf.get(DECOMMISSION_ENABLED)) {
+          logWarning("Disabled because ${DECOMMISSION_ENABLED} is false.")

Review comment:
       ```suggestion
             logWarning(s"Disabled because ${DECOMMISSION_ENABLED} is false.")
   ```
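
The effect of the missing `s` prefix can be shown with plain Scala strings. In this minimal sketch, `InterpolatorSketch` and `settingName` are hypothetical names; the value is the configuration key from the `spark-submit` example earlier in the thread.

```scala
object InterpolatorSketch {
  val settingName = "spark.decommission.enabled"

  // Without the `s` prefix, the placeholder is kept as literal text.
  val plain: String = "Disabled because ${settingName} is false."

  // With the `s` prefix, the placeholder is replaced by the value.
  val interpolated: String = s"Disabled because ${settingName} is false."
}
```

Here `plain` still contains the characters `${settingName}`, while `interpolated` reads `Disabled because spark.decommission.enabled is false.`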






[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988540070


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50470/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988529271


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145994/
   




[GitHub] [spark] dongjoon-hyun closed pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun closed pull request #34832:
URL: https://github.com/apache/spark/pull/34832


   




[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988726888


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50484/
   




[GitHub] [spark] HyukjinKwon commented on a change in pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34832:
URL: https://github.com/apache/spark/pull/34832#discussion_r764714660



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorRollPlugin.scala
##########
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.scheduler.cluster.k8s
+
+import java.util.{Map => JMap}
+import java.util.concurrent.{ScheduledExecutorService, TimeUnit}
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.SparkContext
+import org.apache.spark.api.plugin.{DriverPlugin, ExecutorPlugin, PluginContext, SparkPlugin}
+import org.apache.spark.deploy.k8s.Config.EXECUTOR_ROLL_INTERVAL
+import org.apache.spark.internal.Logging
+import org.apache.spark.internal.config.DECOMMISSION_ENABLED
+import org.apache.spark.scheduler.ExecutorDecommissionInfo
+import org.apache.spark.util.ThreadUtils
+
+/**
+ * Spark plugin to roll executor pods periodically.
+ * This is independent from ExecutorPodsAllocator and aims to decommission executors
+ * one by one in both static and dynamic allocation.
+ *
+ * To use this plugin, we assume that a user has the required maximum number of executors + 1
+ * in both static and dynamic allocation configurations.
+ */
+class ExecutorRollPlugin extends SparkPlugin with Logging {
+  override def driverPlugin(): DriverPlugin = {
+    new DriverPlugin() {
+      private var sparkContext: SparkContext = _
+
+      private val periodicService: ScheduledExecutorService =
+        ThreadUtils.newDaemonSingleThreadScheduledExecutor("executor-roller")
+
+      override def init(sc: SparkContext, ctx: PluginContext): JMap[String, String] = {
+        val interval = sc.conf.get(EXECUTOR_ROLL_INTERVAL)
+        if (interval <= 0) {
+          logWarning(s"Disabled due to invalid interval value, '$interval'")
+        } else if (!sc.conf.get(DECOMMISSION_ENABLED)) {
+          logWarning("Disabled because ${DECOMMISSION_ENABLED} is false.")

Review comment:
       ```suggestion
             logWarning(s"Disabled because ${DECOMMISSION_ENABLED.key} is false.")
   ```
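
A minimal sketch of why the `s` prefix matters here, not taken from the PR: without it, Scala treats `${...}` as literal text, so the log line would print the placeholder rather than the configuration key. `InterpolationDemo` and `ConfEntry` are hypothetical stand-ins for Spark's config entry type; the key string `spark.decommission.enabled` is the one passed on the command line in the PR description.

```scala
object InterpolationDemo {
  // Hypothetical stand-in for a Spark config entry; only `key` matters here.
  final case class ConfEntry(key: String)

  val DecommissionEnabled: ConfEntry = ConfEntry("spark.decommission.enabled")

  // No `s` prefix: the placeholder is kept verbatim in the message.
  val plain: String =
    "Disabled because ${DecommissionEnabled.key} is false."

  // With the `s` prefix: the expression is evaluated and its value is spliced in.
  val interpolated: String =
    s"Disabled because ${DecommissionEnabled.key} is false."
}
```

Here `plain` still contains the literal `${DecommissionEnabled.key}` text, while `interpolated` names the configuration key, which is what the suggested change aims for.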






[GitHub] [spark] AmplabJenkins removed a comment on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988675721


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146008/
   




[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988576855


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50474/
   




[GitHub] [spark] dongjoon-hyun commented on a change in pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #34832:
URL: https://github.com/apache/spark/pull/34832#discussion_r764959186



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorRollPlugin.scala
##########
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.scheduler.cluster.k8s
+
+import java.util.{Map => JMap}
+import java.util.concurrent.{ScheduledExecutorService, TimeUnit}
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.SparkContext
+import org.apache.spark.api.plugin.{DriverPlugin, ExecutorPlugin, PluginContext, SparkPlugin}
+import org.apache.spark.deploy.k8s.Config.EXECUTOR_ROLL_INTERVAL
+import org.apache.spark.internal.Logging
+import org.apache.spark.internal.config.DECOMMISSION_ENABLED
+import org.apache.spark.scheduler.ExecutorDecommissionInfo
+import org.apache.spark.util.ThreadUtils
+
+/**
+ * Spark plugin to roll executor pods periodically.
+ * This is independent from ExecutorPodsAllocator and aims to decommission executors
+ * one by one in both static and dynamic allocation.
+ *
+ * To use this plugin, we assume that a user has the required maximum number of executors + 1
+ * in both static and dynamic allocation configurations.
+ */
+class ExecutorRollPlugin extends SparkPlugin with Logging {
+  override def driverPlugin(): DriverPlugin = {
+    new DriverPlugin() {
+      private var sparkContext: SparkContext = _
+
+      private val periodicService: ScheduledExecutorService =
+        ThreadUtils.newDaemonSingleThreadScheduledExecutor("executor-roller")
+
+      override def init(sc: SparkContext, ctx: PluginContext): JMap[String, String] = {
+        val interval = sc.conf.get(EXECUTOR_ROLL_INTERVAL)
+        if (interval <= 0) {
+          logWarning(s"Disabled due to invalid interval value, '$interval'")
+        } else if (!sc.conf.get(DECOMMISSION_ENABLED)) {
+          logWarning("Disabled because ${DECOMMISSION_ENABLED} is false.")

Review comment:
       Thank you!






[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988548580


   **[Test build #145998 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145998/testReport)** for PR 34832 at commit [`7738c70`](https://github.com/apache/spark/commit/7738c7090a143586131706bb9f3a37911320ac0b).




[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988556566


   **[Test build #145998 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145998/testReport)** for PR 34832 at commit [`7738c70`](https://github.com/apache/spark/commit/7738c7090a143586131706bb9f3a37911320ac0b).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] HyukjinKwon commented on a change in pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34832:
URL: https://github.com/apache/spark/pull/34832#discussion_r764719668



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorRollPlugin.scala
##########
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.scheduler.cluster.k8s
+
+import java.util.{Map => JMap}
+import java.util.concurrent.{ScheduledExecutorService, TimeUnit}
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.SparkContext
+import org.apache.spark.api.plugin.{DriverPlugin, ExecutorPlugin, PluginContext, SparkPlugin}
+import org.apache.spark.deploy.k8s.Config.EXECUTOR_ROLL_INTERVAL
+import org.apache.spark.internal.Logging
+import org.apache.spark.internal.config.DECOMMISSION_ENABLED
+import org.apache.spark.scheduler.ExecutorDecommissionInfo
+import org.apache.spark.util.ThreadUtils
+
+/**
+ * Spark plugin to roll executor pods periodically.
+ * This is independent from ExecutorPodsAllocator and aims to decommission executors
+ * one by one in both static and dynamic allocation.
+ *
+ * To use this plugin, we assume that a user has the required maximum number of executors + 1
+ * in both static and dynamic allocation configurations.
+ */
+class ExecutorRollPlugin extends SparkPlugin with Logging {
+  override def driverPlugin(): DriverPlugin = {
+    new DriverPlugin() {
+      private var sparkContext: SparkContext = _
+
+      private val periodicService: ScheduledExecutorService =
+        ThreadUtils.newDaemonSingleThreadScheduledExecutor("executor-roller")
+
+      override def init(sc: SparkContext, ctx: PluginContext): JMap[String, String] = {
+        val interval = sc.conf.get(EXECUTOR_ROLL_INTERVAL)
+        if (interval <= 0) {
+          logWarning(s"Disabled due to invalid interval value, '$interval'")
+        } else if (!sc.conf.get(DECOMMISSION_ENABLED)) {
+          logWarning("Disabled because ${DECOMMISSION_ENABLED} is false.")
+        } else {
+          // Scheduler is not created yet
+          sparkContext = sc
+
+          periodicService.scheduleAtFixedRate(() => {
+            try {
+              sparkContext.schedulerBackend match {
+                case scheduler: KubernetesClusterSchedulerBackend =>
+                  // Roughly assume that the smallest ID executor is the most long-lived one.
+                  val smallestID = scheduler
+                    .getExecutorIds()
+                    .filterNot(_.equals(SparkContext.DRIVER_IDENTIFIER))
+                    .map(_.toInt)
+                    .sorted
+                    .headOption
+                  smallestID match {
+                    case Some(id) =>
+                      // Use decommission to be safe.
+                      logInfo(s"Ask to decommission executor $id")
+                      val now = System.currentTimeMillis()
+                      scheduler.decommissionExecutor(
+                        id.toString,
+                        ExecutorDecommissionInfo(s"Rolling at $now"),
+                        adjustTargetNumExecutors = false)
+                    case _ =>
+                      logInfo("There is nothing to roll.")
+                  }
+                case _ =>
+                  logWarning("This plugin expects KubernetesClusterSchedulerBackend.")

Review comment:
       ```suggestion
                     logWarning("This plugin expects " +
                       s"${classOf[KubernetesClusterSchedulerBackend].getSimpleName}.")
   ```
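
For readers following the quoted hunk, the step that picks the executor to roll can be sketched in isolation. This is an illustrative sketch, not the plugin's code: `RollTargetDemo` and `oldestExecutorId` are hypothetical names, and the driver ID is hard-coded as `"driver"` on the assumption that it matches `SparkContext.DRIVER_IDENTIFIER`.

```scala
object RollTargetDemo {
  // Assumed value of SparkContext.DRIVER_IDENTIFIER.
  val DriverId: String = "driver"

  // Drop the driver entry, compare the remaining IDs numerically,
  // and return the smallest one (None when no executor is registered).
  def oldestExecutorId(ids: Seq[String]): Option[Int] =
    ids.filterNot(_ == DriverId).map(_.toInt).sorted.headOption
}
```

Converting to `Int` before sorting matters: as strings, "10" sorts before "2", so a lexicographic sort could pick the wrong executor. With the sketch above, `RollTargetDemo.oldestExecutorId(Seq("driver", "10", "2"))` yields `Some(2)`.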






[GitHub] [spark] SparkQA removed a comment on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988663274


   **[Test build #146008 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146008/testReport)** for PR 34832 at commit [`3883d76`](https://github.com/apache/spark/commit/3883d766ce4eb2a02de282619bde7e62a025ab8e).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988546899


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145996/
   




[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988606326


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50478/
   




[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988663274


   **[Test build #146008 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146008/testReport)** for PR 34832 at commit [`3883d76`](https://github.com/apache/spark/commit/3883d766ce4eb2a02de282619bde7e62a025ab8e).




[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988522146


   **[Test build #145994 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145994/testReport)** for PR 34832 at commit [`e3f06e1`](https://github.com/apache/spark/commit/e3f06e12da7207dfb6aa6a6560c7d07241370cd3).




[GitHub] [spark] SparkQA removed a comment on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988548580


   **[Test build #145998 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145998/testReport)** for PR 34832 at commit [`7738c70`](https://github.com/apache/spark/commit/7738c7090a143586131706bb9f3a37911320ac0b).




[GitHub] [spark] AmplabJenkins commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988546899


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145996/
   




[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988589441


   **[Test build #146002 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146002/testReport)** for PR 34832 at commit [`2a10175`](https://github.com/apache/spark/commit/2a101754522bbccb00e31bba4a6f21a412d689a3).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA removed a comment on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988580143


   **[Test build #146002 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146002/testReport)** for PR 34832 at commit [`2a10175`](https://github.com/apache/spark/commit/2a101754522bbccb00e31bba4a6f21a412d689a3).




[GitHub] [spark] dongjoon-hyun commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988638697


   I found that Jenkins only runs `k8sTestTag`. 




[GitHub] [spark] AmplabJenkins commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988581718


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50474/
   




[GitHub] [spark] dongjoon-hyun commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988600740


   Hi, @HyukjinKwon . Could you review this PR?




[GitHub] [spark] AmplabJenkins commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988577630








[GitHub] [spark] dongjoon-hyun commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988909924


   Since the recent comments are mostly about string changes, I'll merge this. Thank you so much for the reviews and comments, @HyukjinKwon. Merged to master for Apache Spark 3.3.0.




[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988580143


   **[Test build #146002 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146002/testReport)** for PR 34832 at commit [`2a10175`](https://github.com/apache/spark/commit/2a101754522bbccb00e31bba4a6f21a412d689a3).




[GitHub] [spark] HyukjinKwon commented on a change in pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34832:
URL: https://github.com/apache/spark/pull/34832#discussion_r764721722



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/DecommissionSuite.scala
##########
@@ -176,6 +177,34 @@ private[spark] trait DecommissionSuite { k8sSuite: KubernetesSuite =>
       executorPatience = None,
       decommissioningTest = true)
   }
+
+  test("Rolling decommissioning", k8sTestTag) {

Review comment:
       ```suggestion
     test("SPARK-37576: Rolling decommissioning", k8sTestTag) {
   ```






[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988675419


   **[Test build #146008 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146008/testReport)** for PR 34832 at commit [`3883d76`](https://github.com/apache/spark/commit/3883d766ce4eb2a02de282619bde7e62a025ab8e).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988745871


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50484/
   




[GitHub] [spark] dongjoon-hyun edited a comment on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun edited a comment on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988907124


   ```
   KubernetesSuite:
   - Run SparkPi with no resources
   - Run SparkPi with no resources & statefulset allocation
   - Run SparkPi with a very long application name.
   - Use SparkLauncher.NO_RESOURCE
   - Run SparkPi with a master URL without a scheme.
   - Run SparkPi with an argument.
   - Run SparkPi with custom labels, annotations, and environment variables.
   - All pods have the same service account by default
   - Run extraJVMOptions check on driver
   - Run SparkRemoteFileTest using a remote data file
   - Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties
   - Run SparkPi with env and mount secrets.
   - Run PySpark on simple pi.py example
   - Run PySpark to test a pyfiles example
   - Run PySpark with memory customization
   - Run in client mode.
   - Start pod creation from template
   - PVs with local hostpath storage on statefulsets
   - PVs with local hostpath and storageClass on statefulsets
   - PVs with local storage
   - Launcher client dependencies
   - SPARK-33615: Launcher client archives
   - SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
   - SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
   - Launcher python client dependencies using a zip file
   - Test basic decommissioning
   - Test basic decommissioning with shuffle cleanup
   - Test decommissioning with dynamic allocation & shuffle cleanups *** FAILED ***
   - Test decommissioning timeouts
   - Rolling decommissioning
   - Run SparkR on simple dataframe.R example *** FAILED ***
     The code passed to eventually never returned normally. Attempted 190 times over 3.0010038906833336 minutes. Last failure message: false was not true. (KubernetesSuite.scala:452)
   Run completed in 34 minutes, 58 seconds.
   Total number of tests run: 31
   Suites: completed 2, aborted 0
   Tests: succeeded 29, failed 2, canceled 0, ignored 0, pending 0
   *** 2 TESTS FAILED ***
   ```




[GitHub] [spark] SparkQA removed a comment on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988525801


   **[Test build #145996 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145996/testReport)** for PR 34832 at commit [`4bc0851`](https://github.com/apache/spark/commit/4bc0851240edbccf7848d561d2a67f3275516021).




[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988547067


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50472/
   




[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988581692


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50474/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988659934


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50478/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988529271


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145994/
   




[GitHub] [spark] SparkQA commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988525801


   **[Test build #145996 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145996/testReport)** for PR 34832 at commit [`4bc0851`](https://github.com/apache/spark/commit/4bc0851240edbccf7848d561d2a67f3275516021).




[GitHub] [spark] AmplabJenkins commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988745871


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50484/
   




[GitHub] [spark] dongjoon-hyun commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988637195


   Jenkins K8s IT finished, but the result looks a little weird because it doesn't include the new test case.
   ```
   KubernetesSuite:
   - Run SparkPi with no resources
   - Run SparkPi with no resources & statefulset allocation
   - Run SparkPi with a very long application name.
   - Use SparkLauncher.NO_RESOURCE
   - Run SparkPi with a master URL without a scheme.
   - Run SparkPi with an argument.
   - Run SparkPi with custom labels, annotations, and environment variables.
   - All pods have the same service account by default
   - Run extraJVMOptions check on driver
   - Run SparkRemoteFileTest using a remote data file
   - Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties
   - Run SparkPi with env and mount secrets.
   - Run PySpark on simple pi.py example
   - Run PySpark to test a pyfiles example
   - Run PySpark with memory customization
   - Run in client mode.
   - Start pod creation from template
   - PVs with local hostpath storage on statefulsets
   - PVs with local hostpath and storageClass on statefulsets
   - PVs with local storage
   - Launcher client dependencies
   - SPARK-33615: Launcher client archives
   - SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
   - SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
   - Launcher python client dependencies using a zip file
   - Test basic decommissioning
   - Test basic decommissioning with shuffle cleanup
   - Test decommissioning with dynamic allocation & shuffle cleanups *** FAILED ***
   - Test decommissioning timeouts
   - Run SparkR on simple dataframe.R example *** FAILED ***
   Run completed in 31 minutes, 58 seconds.
   Total number of tests run: 30
   Suites: completed 2, aborted 0
   Tests: succeeded 28, failed 2, canceled 0, ignored 0, pending 0
   *** 2 TESTS FAILED ***
   ```




[GitHub] [spark] HyukjinKwon commented on a change in pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34832:
URL: https://github.com/apache/spark/pull/34832#discussion_r764721321



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorRollPlugin.scala
##########
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.scheduler.cluster.k8s
+
+import java.util.{Map => JMap}
+import java.util.concurrent.{ScheduledExecutorService, TimeUnit}
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.SparkContext
+import org.apache.spark.api.plugin.{DriverPlugin, ExecutorPlugin, PluginContext, SparkPlugin}
+import org.apache.spark.deploy.k8s.Config.EXECUTOR_ROLL_INTERVAL
+import org.apache.spark.internal.Logging
+import org.apache.spark.internal.config.DECOMMISSION_ENABLED
+import org.apache.spark.scheduler.ExecutorDecommissionInfo
+import org.apache.spark.util.ThreadUtils
+
+/**
+ * Spark plugin to roll executor pods periodically.
+ * This is independent from ExecutorPodsAllocator and aims to decommission executors
+ * one by one in both static and dynamic allocation.
+ *
+ * To use this plugin, we assume that a user has the required maximum number of executors + 1
+ * in both static and dynamic allocation configurations.
+ */
+class ExecutorRollPlugin extends SparkPlugin with Logging {
+  override def driverPlugin(): DriverPlugin = {
+    new DriverPlugin() {
+      private var sparkContext: SparkContext = _
+
+      private val periodicService: ScheduledExecutorService =
+        ThreadUtils.newDaemonSingleThreadScheduledExecutor("executor-roller")
+
+      override def init(sc: SparkContext, ctx: PluginContext): JMap[String, String] = {
+        val interval = sc.conf.get(EXECUTOR_ROLL_INTERVAL)
+        if (interval <= 0) {
+          logWarning(s"Disabled due to invalid interval value, '$interval'")
+        } else if (!sc.conf.get(DECOMMISSION_ENABLED)) {
+          logWarning(s"Disabled because ${DECOMMISSION_ENABLED.key} is false.")
+        } else {
+          // Scheduler is not created yet
+          sparkContext = sc
+
+          periodicService.scheduleAtFixedRate(() => {
+            try {
+              sparkContext.schedulerBackend match {
+                case scheduler: KubernetesClusterSchedulerBackend =>
+                  // Roughly assume that the smallest ID executor is the most long-lived one.
+                  val smallestID = scheduler
+                    .getExecutorIds()
+                    .filterNot(_.equals(SparkContext.DRIVER_IDENTIFIER))
+                    .map(_.toInt)
+                    .sorted
+                    .headOption
+                  smallestID match {
+                    case Some(id) =>
+                      // Use decommission to be safe.
+                      logInfo(s"Ask to decommission executor $id")
+                      val now = System.currentTimeMillis()
+                      scheduler.decommissionExecutor(
+                        id.toString,
+                        ExecutorDecommissionInfo(s"Rolling at $now"),
+                        adjustTargetNumExecutors = false)
+                    case _ =>
+                      logInfo("There is nothing to roll.")
+                  }
+                case _ =>
+                  logWarning("This plugin expects KubernetesClusterSchedulerBackend.")
+              }
+            } catch {
+              case e: Exception => logError("Error in rolling thread", e)
+            }
+          }, interval, interval, TimeUnit.SECONDS)
+        }
+        Map.empty[String, String].asJava
+      }
+
+      override def shutdown(): Unit = {
+        periodicService.shutdown()
+      }

Review comment:
   ```suggestion
         override def shutdown(): Unit = periodicService.shutdown()
   ```
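
For readers following the diff, here is a minimal, self-contained sketch of the rolling loop using only the JDK scheduler. It is not the PR's code: `ExecutorRollSketch`, `Roller`, `listIds`, and `decommission` are hypothetical stand-ins for the plugin and for `KubernetesClusterSchedulerBackend`, and `DriverId` mirrors the value of `SparkContext.DRIVER_IDENTIFIER`. It uses `min` instead of `sorted.headOption` purely as an illustration, and it writes `shutdown()` in the single-expression form suggested above.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}

object ExecutorRollSketch {
  // Stand-in for SparkContext.DRIVER_IDENTIFIER (assumption for this sketch).
  val DriverId = "driver"

  /** Pick the executor with the smallest numeric ID, skipping the driver. */
  def pickExecutorToRoll(ids: Seq[String]): Option[String] = {
    // Compare IDs as numbers: as strings, "10" would sort before "2".
    val numeric = ids.filterNot(_ == DriverId).map(_.toInt)
    if (numeric.isEmpty) None else Some(numeric.min.toString)
  }

  /**
   * Hypothetical roller. `listIds` and `decommission` stand in for the
   * scheduler backend calls made by the real plugin.
   */
  final class Roller(
      listIds: () => Seq[String],
      decommission: String => Unit,
      intervalSeconds: Long) {

    private val service: ScheduledExecutorService =
      Executors.newSingleThreadScheduledExecutor()

    def start(): Unit = {
      val task: Runnable = () => {
        // An uncaught exception would suppress all later runs of a
        // fixed-rate task, so catch and report it here.
        try pickExecutorToRoll(listIds()).foreach(decommission)
        catch {
          case e: Exception => Console.err.println(s"Error in rolling thread: $e")
        }
      }
      service.scheduleAtFixedRate(task, intervalSeconds, intervalSeconds, TimeUnit.SECONDS)
    }

    // Single-expression form, as in the review suggestion.
    def shutdown(): Unit = service.shutdown()
  }
}
```

Keeping the selection in a pure function such as `pickExecutorToRoll` makes it checkable without a cluster, for example `ExecutorRollSketch.pickExecutorToRoll(Seq("driver", "10", "2"))`.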






[GitHub] [spark] AmplabJenkins commented on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988675721


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146008/
   




[GitHub] [spark] dongjoon-hyun edited a comment on pull request #34832: [SPARK-37576][K8S] Support built-in K8s executor rolling plugin

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun edited a comment on pull request #34832:
URL: https://github.com/apache/spark/pull/34832#issuecomment-988907124


   Previously, the newly added test case passed, and the two failures are unrelated to this PR.
   ```
   KubernetesSuite:
   - Run SparkPi with no resources
   - Run SparkPi with no resources & statefulset allocation
   - Run SparkPi with a very long application name.
   - Use SparkLauncher.NO_RESOURCE
   - Run SparkPi with a master URL without a scheme.
   - Run SparkPi with an argument.
   - Run SparkPi with custom labels, annotations, and environment variables.
   - All pods have the same service account by default
   - Run extraJVMOptions check on driver
   - Run SparkRemoteFileTest using a remote data file
   - Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties
   - Run SparkPi with env and mount secrets.
   - Run PySpark on simple pi.py example
   - Run PySpark to test a pyfiles example
   - Run PySpark with memory customization
   - Run in client mode.
   - Start pod creation from template
   - PVs with local hostpath storage on statefulsets
   - PVs with local hostpath and storageClass on statefulsets
   - PVs with local storage
   - Launcher client dependencies
   - SPARK-33615: Launcher client archives
   - SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
   - SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
   - Launcher python client dependencies using a zip file
   - Test basic decommissioning
   - Test basic decommissioning with shuffle cleanup
   - Test decommissioning with dynamic allocation & shuffle cleanups *** FAILED ***
   - Test decommissioning timeouts
   - Rolling decommissioning
   - Run SparkR on simple dataframe.R example *** FAILED ***
     The code passed to eventually never returned normally. Attempted 190 times over 3.0010038906833336 minutes. Last failure message: false was not true. (KubernetesSuite.scala:452)
   Run completed in 34 minutes, 58 seconds.
   Total number of tests run: 31
   Suites: completed 2, aborted 0
   Tests: succeeded 29, failed 2, canceled 0, ignored 0, pending 0
   *** 2 TESTS FAILED ***
   ```
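
The claim that only a new test was added, with no new failures, can be checked mechanically by comparing this report with the earlier Jenkins report in this thread. The sketch below is a rough helper under stated assumptions: `TestReportDiff`, `Run`, `parse`, `addedTests`, and `newFailures` are hypothetical names, and the parser only assumes the report shape shown above (test lines start with `- `, failed ones end with `*** FAILED ***`).

```scala
object TestReportDiff {
  /** Test names seen in one report, plus the subset marked as failed. */
  final case class Run(tests: Set[String], failed: Set[String])

  private val FailedMarker = " *** FAILED ***"

  /** Parse a ScalaTest-style text report; lines not starting with "- " are ignored. */
  def parse(report: String): Run = {
    val entries = report.split("\n").toList
      .map(_.trim)
      .filter(_.startsWith("- "))
      .map(_.drop(2))
    val failed = entries.filter(_.endsWith(FailedMarker)).map(_.stripSuffix(FailedMarker))
    Run(entries.map(_.stripSuffix(FailedMarker)).toSet, failed.toSet)
  }

  /** Tests present in `after` but not in `before`. */
  def addedTests(before: Run, after: Run): Set[String] = after.tests -- before.tests

  /** Failures present in `after` but not in `before`. */
  def newFailures(before: Run, after: Run): Set[String] = after.failed -- before.failed
}
```

Feeding the two reports from this thread into `parse` and then calling `addedTests` and `newFailures` would show whether `Rolling decommissioning` is the only addition and whether the failed set changed.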

