You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/02/17 12:31:23 UTC

[GitHub] [spark] Yikun opened a new pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Yikun opened a new pull request #35553:
URL: https://github.com/apache/spark/pull/35553


   ### What changes were proposed in this pull request?
   This patch adds the queue configuration for cutomized scheduler, and also add Volcano implementaions, also add a integrations test.
   
   ### Why are the changes needed?
   Support queue scheduling with Volcano implementations.
   
   
   ### Does this PR introduce _any_ user-facing change?
   Yes, introduce a new configuration `spark.kubernetes.job.queue`.
   
   ### How was this patch tested?
   - UT passed.
   - Spark on Kubernetes Integration test passed.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816383094



##########
File path: resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/VolcanoFeatureStepSuite.scala
##########
@@ -20,11 +20,13 @@ import io.fabric8.volcano.scheduling.v1beta1.PodGroup
 
 import org.apache.spark.{SparkConf, SparkFunSuite}
 import org.apache.spark.deploy.k8s._
+import org.apache.spark.deploy.k8s.Config._
 
 class VolcanoFeatureStepSuite extends SparkFunSuite {
 
   test("SPARK-36061: Driver Pod with Volcano PodGroup") {
     val sparkConf = new SparkConf()
+      .set(KUBERNETES_JOB_QUEUE.key, "queue")

Review comment:
       Please don't touch this test case to cover the default case. Instead add another test case for SPARK-38188.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r809349756



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/VolcanoFeatureStep.scala
##########
@@ -45,6 +47,9 @@ private[spark] class VolcanoFeatureStep extends KubernetesDriverCustomFeatureCon
         .withName(podGroupName)
         .withNamespace(namespace)
       .endMetadata()
+      .editOrNewSpec()
+        .withQueue(queue)
+      .endSpec()

Review comment:
       Please don't add this to the spec if there is no given queue from the users.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r812960434



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/VolcanoFeatureStep.scala
##########
@@ -20,6 +20,7 @@ import io.fabric8.kubernetes.api.model._
 import io.fabric8.volcano.scheduling.v1beta1.PodGroupBuilder
 
 import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesDriverConf, KubernetesExecutorConf, SparkPod}
+import org.apache.spark.deploy.k8s.Config._

Review comment:
       We will also introduce serveral configurations in followup, so maybe `Config._` is better




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816389913



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -63,8 +150,40 @@ private[spark] trait VolcanoTestsSuite { k8sSuite: KubernetesSuite =>
       }
     )
   }
+
+  test("Run 4 SparkPi Jobs with 2 volcano queues (queue scheduling)", k8sTestTag, volcanoTag) {
+    createOrReplaceYAMLResource(VOLCANO_Q0_DISABLE_Q1_ENABLE_YAML)
+    val jobNum = 4
+    val groupLocator = "queue-test" + UUID.randomUUID().toString.replaceAll("-", "")
+    // Submit two jobs into disabled queue0 and enabled queue1
+    (1 to jobNum).foreach { i =>
+      Future {
+        runJobAndVerify(i.toString, Option(groupLocator), Option(s"queue${i % 2}"))
+      }
+    }
+    // There are two `Succeeded` jobs and two `Pending` jobs
+    Eventually.eventually(TIMEOUT, INTERVAL) {
+      val completedPods = getPods("driver", groupLocator, "Succeeded")
+      assert(completedPods.size === 2)
+      val pendingPods = getPods("driver", groupLocator, "Pending")
+      assert(pendingPods.size === 2)
+    }
+    // Now, enable all queues, then all jobs completed
+    createOrReplaceYAMLResource(VOLCANO_ENABLE_Q0_AND_Q1_YAML)

Review comment:
       Please make a separate test case for this.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816401326



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
##########
@@ -292,6 +292,14 @@ private[spark] object Config extends Logging {
       .stringConf
       .createOptional
 
+  val KUBERNETES_JOB_QUEUE = ConfigBuilder("spark.kubernetes.job.queue")
+    .doc("The name of the queue to which the job is submitted. This info " +
+      "will be stored in configuration and passed to specified feature step (such as " +
+      "`VolcanoFeatureStep`).")

Review comment:
       Please remove the following.
   ```
   (such as `VolcanoFeatureStep`)
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816399653



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -16,16 +16,33 @@
  */
 package org.apache.spark.deploy.k8s.integrationtest
 
+import java.io.{File, FileInputStream}
+import java.util.UUID
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+import scala.concurrent._
+
+import ExecutionContext.Implicits.global

Review comment:
       Actually, SPARK-37891 discourage this. If you need to use this please wrap this with `// scalastyle:off executioncontextglobal` and on.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816393973



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -37,12 +54,82 @@ private[spark] trait VolcanoTestsSuite { k8sSuite: KubernetesSuite =>
     assert(annotations.get("scheduling.k8s.io/group-name") === s"$appId-podgroup")
   }
 
-  protected def checkPodGroup(pod: Pod): Unit = {
+  protected def checkPodGroup(
+      pod: Pod,
+      queue: Option[String] = None): Unit = {
     val appId = pod.getMetadata.getLabels.get("spark-app-selector")
     val podGroupName = s"$appId-podgroup"
-    val volcanoClient = kubernetesTestComponents.kubernetesClient.adapt(classOf[VolcanoClient])
     val podGroup = volcanoClient.podGroups().withName(podGroupName).get()
     assert(podGroup.getMetadata.getOwnerReferences.get(0).getName === pod.getMetadata.getName)
+    val spec = podGroup.getSpec
+    if (queue.isDefined) assert(spec.getQueue === queue.get)
+  }
+
+  private def createOrReplaceYAMLResource(yamlPath: String): Unit = {
+    k8sClient.load(new FileInputStream(yamlPath)).createOrReplace()
+  }
+
+  private def deleteYAMLResource(yamlPath: String): Unit = {
+    k8sClient.load(new FileInputStream(yamlPath)).delete()
+  }
+
+  private def getPods(
+      role: String,
+      groupLocator: String,
+      statusPhase: String): mutable.Buffer[Pod] = {
+    k8sClient
+      .pods()
+      .withLabel("spark-group-locator", groupLocator)
+      .withLabel("spark-role", role)
+      .withField("status.phase", statusPhase)
+      .list()
+      .getItems.asScala
+  }
+
+  def runJobAndVerify(
+      batchSuffix: String,
+      groupLoc: Option[String] = None,
+      queue: Option[String] = None): Unit = {
+    val appLoc = s"${appLocator}${batchSuffix}"
+    val podName = s"${driverPodName}-${batchSuffix}"
+    // create new configuration for every job
+    val conf = createVolcanoSparkConf(podName, appLoc, groupLoc, queue)
+    runSparkPiAndVerifyCompletion(
+      driverPodChecker = (driverPod: Pod) => {
+        checkScheduler(driverPod)
+        checkAnnotaion(driverPod)
+        checkPodGroup(driverPod, queue)
+      },
+      executorPodChecker = (executorPod: Pod) => {
+        checkScheduler(executorPod)
+        checkAnnotaion(executorPod)
+      },
+      customSparkConf = Option(conf),
+      customAppLocator = Option(appLoc)
+    )
+  }
+
+  private def createVolcanoSparkConf(
+      driverPodName: String = driverPodName,
+      appLoc: String = appLocator,
+      groupLoc: Option[String] = None,
+      queue: Option[String] = None): SparkAppConf = {
+    val conf = kubernetesTestComponents.newSparkAppConf()
+      .set(CONTAINER_IMAGE.key, image)
+      .set(KUBERNETES_DRIVER_POD_NAME.key, driverPodName)
+      .set(s"${KUBERNETES_DRIVER_LABEL_PREFIX}spark-app-locator", appLoc)
+      .set(s"${KUBERNETES_EXECUTOR_LABEL_PREFIX}spark-app-locator", appLoc)
+      .set(NETWORK_AUTH_ENABLED.key, "true")
+      // below is volcano specific configuration
+      .set(KUBERNETES_SCHEDULER_NAME.key, "volcano")
+      .set(KUBERNETES_DRIVER_POD_FEATURE_STEPS.key, VOLCANO_FEATURE_STEP)
+      .set(KUBERNETES_EXECUTOR_POD_FEATURE_STEPS.key, VOLCANO_FEATURE_STEP)
+    if (queue.isDefined) conf.set(KUBERNETES_JOB_QUEUE.key, queue.get)
+    if (groupLoc.isDefined) {
+      conf.set(s"${KUBERNETES_DRIVER_LABEL_PREFIX}spark-group-locator", groupLoc.get)
+      conf.set(s"${KUBERNETES_EXECUTOR_LABEL_PREFIX}spark-group-locator", groupLoc.get)
+    }

Review comment:
       ```scala
       groupLoc.foreach { locator =>
         conf.set(s"${KUBERNETES_DRIVER_LABEL_PREFIX}spark-group-locator", locator)
         conf.set(s"${KUBERNETES_EXECUTOR_LABEL_PREFIX}spark-group-locator", locator)
       }
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun edited a comment on pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
Yikun edited a comment on pull request #35553:
URL: https://github.com/apache/spark/pull/35553#issuecomment-1042976528


   Kubernetes K8S integration test passed:
   <details><summary> Click to see results</summary>
   
   ```
   $ build/sbt -Pvolcano -Pkubernetes -Pkubernetes-integration-tests -Dtest.exclude.tags=minikube,r -Dspark.kubernetes.test.namespace=default "kubernetes-integration-tests/test"
   
   [info] KubernetesSuite:
   [info] - Run SparkPi with no resources (10 seconds, 946 milliseconds)
   [info] - Run SparkPi with no resources & statefulset allocation (10 seconds, 714 milliseconds)
   [info] - Run SparkPi with a very long application name. (11 seconds, 494 milliseconds)
   [info] - Use SparkLauncher.NO_RESOURCE (10 seconds, 544 milliseconds)
   [info] - Run SparkPi with a master URL without a scheme. (11 seconds, 588 milliseconds)
   [info] - Run SparkPi with an argument. (11 seconds, 619 milliseconds)
   [info] - Run SparkPi with custom labels, annotations, and environment variables. (10 seconds, 555 milliseconds)
   [info] - All pods have the same service account by default (11 seconds, 490 milliseconds)
   [info] - Run extraJVMOptions check on driver (5 seconds, 467 milliseconds)
   [info] - Run SparkRemoteFileTest using a remote data file (10 seconds, 745 milliseconds)
   [info] - Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j2.properties (18 seconds, 999 milliseconds)
   [info] - Run SparkPi with env and mount secrets. (20 seconds, 363 milliseconds)
   [info] - Run PySpark on simple pi.py example (12 seconds, 653 milliseconds)
   [info] - Run PySpark to test a pyfiles example (14 seconds, 644 milliseconds)
   [info] - Run PySpark with memory customization (12 seconds, 467 milliseconds)
   [info] - Run in client mode. (9 seconds, 182 milliseconds)
   [info] - Start pod creation from template (11 seconds, 619 milliseconds)
   [info] - Test basic decommissioning (45 seconds, 783 milliseconds)
   [info] - Test basic decommissioning with shuffle cleanup (46 seconds, 198 milliseconds)
   [info] - Test decommissioning with dynamic allocation & shuffle cleanups (2 minutes, 45 seconds)
   [info] - Test decommissioning timeouts (45 seconds, 242 milliseconds)
   [info] - SPARK-37576: Rolling decommissioning (1 minute, 7 seconds)
   [info] VolcanoSuite:
   [info] - Run SparkPi with no resources (12 seconds, 548 milliseconds)
   [info] - Run SparkPi with no resources & statefulset allocation (11 seconds, 638 milliseconds)
   [info] - Run SparkPi with a very long application name. (11 seconds, 565 milliseconds)
   [info] - Use SparkLauncher.NO_RESOURCE (11 seconds, 492 milliseconds)
   [info] - Run SparkPi with a master URL without a scheme. (11 seconds, 465 milliseconds)
   [info] - Run SparkPi with an argument. (11 seconds, 617 milliseconds)
   [info] - Run SparkPi with custom labels, annotations, and environment variables. (12 seconds, 538 milliseconds)
   [info] - All pods have the same service account by default (11 seconds, 476 milliseconds)
   [info] - Run extraJVMOptions check on driver (6 seconds, 457 milliseconds)
   [info] - Run SparkRemoteFileTest using a remote data file (12 seconds, 536 milliseconds)
   [info] - Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j2.properties (18 seconds, 953 milliseconds)
   [info] - Run SparkPi with env and mount secrets. (20 seconds, 923 milliseconds)
   [info] - Run PySpark on simple pi.py example (13 seconds, 594 milliseconds)
   [info] - Run PySpark to test a pyfiles example (16 seconds, 584 milliseconds)
   [info] - Run PySpark with memory customization (12 seconds, 466 milliseconds)
   [info] - Run in client mode. (9 seconds, 125 milliseconds)
   [info] - Start pod creation from template (11 seconds, 642 milliseconds)
   [info] - Test basic decommissioning (46 seconds, 809 milliseconds)
   [info] - Test basic decommissioning with shuffle cleanup (48 seconds, 186 milliseconds)
   [info] - Test decommissioning with dynamic allocation & shuffle cleanups (2 minutes, 47 seconds)
   [info] - Test decommissioning timeouts (47 seconds, 200 milliseconds)
   [info] - SPARK-37576: Rolling decommissioning (1 minute, 8 seconds)
   [info] - Run SparkPi with volcano scheduler (11 seconds, 693 milliseconds)
   [info] - Run 4 SparkPi Jobs with 2 volcano queues (queue scheduling) (30 seconds, 590 milliseconds)
   ```
   
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
Yikun commented on pull request #35553:
URL: https://github.com/apache/spark/pull/35553#issuecomment-1043588796


   ping @holdenk @tgravescs @mridulm @yangwwei @yaooqinn @rdblue @HyukjinKwon 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on pull request #35553: [SPARK-38188][K8S] Support `spark.kubernetes.job.queue`

Posted by GitBox <gi...@apache.org>.
Yikun commented on pull request #35553:
URL: https://github.com/apache/spark/pull/35553#issuecomment-1055029844


   ```
   [info] VolcanoSuite:
   [info] - Run SparkPi with volcano scheduler (18 seconds, 743 milliseconds)
   [info] - SPARK-38188: Run SparkPi jobs with 2 queues (only 1 enable) (22 seconds, 661 milliseconds)
   [info] - SPARK-38188: Run SparkPi jobs with 2 queues (all enable) (27 seconds, 715 milliseconds)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816400570



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -16,16 +16,33 @@
  */
 package org.apache.spark.deploy.k8s.integrationtest
 
+import java.io.{File, FileInputStream}
+import java.util.UUID
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+import scala.concurrent._

Review comment:
       Since  you are using a single class, `Future`, please import directly `import scala.concurrent.Future`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35553: [SPARK-38188][K8S] Support `spark.kubernetes.job.queue`

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816451073



##########
File path: resource-managers/kubernetes/integration-tests/src/test/resources/volcano/disable-queue0-enable-queue1.yml
##########
@@ -0,0 +1,29 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+apiVersion: scheduling.volcano.sh/v1beta1

Review comment:
       At present, there is no clear timeline for apiVersion, but it is in roadmap, but at least there will be no change before Spark 3.3 release. also cc @william-wang




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35553: [SPARK-38188][K8S] Support `spark.kubernetes.job.queue`

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816452421



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
##########
@@ -292,6 +292,14 @@ private[spark] object Config extends Logging {
       .stringConf
       .createOptional
 
+  val KUBERNETES_JOB_QUEUE = ConfigBuilder("spark.kubernetes.job.queue")

Review comment:
       I can get your concern in here, I think it's a good note. : )
   
   The short plan is we need to also add `minCPU`/`priorityClassName` in here, and there are also some pontential job related introduce in future.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on pull request #35553: [SPARK-38188][K8S] Support `spark.kubernetes.job.queue`

Posted by GitBox <gi...@apache.org>.
Yikun commented on pull request #35553:
URL: https://github.com/apache/spark/pull/35553#issuecomment-1055029682


   @dongjoon-hyun Thanks for your help!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816388289



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -37,12 +54,82 @@ private[spark] trait VolcanoTestsSuite { k8sSuite: KubernetesSuite =>
     assert(annotations.get("scheduling.k8s.io/group-name") === s"$appId-podgroup")
   }
 
-  protected def checkPodGroup(pod: Pod): Unit = {
+  protected def checkPodGroup(
+      pod: Pod,
+      queue: Option[String] = None): Unit = {
     val appId = pod.getMetadata.getLabels.get("spark-app-selector")
     val podGroupName = s"$appId-podgroup"
-    val volcanoClient = kubernetesTestComponents.kubernetesClient.adapt(classOf[VolcanoClient])
     val podGroup = volcanoClient.podGroups().withName(podGroupName).get()
     assert(podGroup.getMetadata.getOwnerReferences.get(0).getName === pod.getMetadata.getName)
+    val spec = podGroup.getSpec
+    if (queue.isDefined) assert(spec.getQueue === queue.get)

Review comment:
       ```scala
   - val spec = podGroup.getSpec
   - if (queue.isDefined) assert(spec.getQueue === queue.get)
   + queue.foreach(q => assert(q === podGroup.getSpec.getQueue))
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun edited a comment on pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
Yikun edited a comment on pull request #35553:
URL: https://github.com/apache/spark/pull/35553#issuecomment-1042976528


   Kubernetes K8S integration test passed:
   <details><summary> Click to see results</summary>
   ```
   $ build/sbt -Pvolcano -Pkubernetes -Pkubernetes-integration-tests -Dtest.exclude.tags=minikube,r -Dspark.kubernetes.test.namespace=default "kubernetes-integration-tests/test"
   
   [info] KubernetesSuite:
   [info] - Run SparkPi with no resources (10 seconds, 946 milliseconds)
   [info] - Run SparkPi with no resources & statefulset allocation (10 seconds, 714 milliseconds)
   [info] - Run SparkPi with a very long application name. (11 seconds, 494 milliseconds)
   [info] - Use SparkLauncher.NO_RESOURCE (10 seconds, 544 milliseconds)
   [info] - Run SparkPi with a master URL without a scheme. (11 seconds, 588 milliseconds)
   [info] - Run SparkPi with an argument. (11 seconds, 619 milliseconds)
   [info] - Run SparkPi with custom labels, annotations, and environment variables. (10 seconds, 555 milliseconds)
   [info] - All pods have the same service account by default (11 seconds, 490 milliseconds)
   [info] - Run extraJVMOptions check on driver (5 seconds, 467 milliseconds)
   [info] - Run SparkRemoteFileTest using a remote data file (10 seconds, 745 milliseconds)
   [info] - Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j2.properties (18 seconds, 999 milliseconds)
   [info] - Run SparkPi with env and mount secrets. (20 seconds, 363 milliseconds)
   [info] - Run PySpark on simple pi.py example (12 seconds, 653 milliseconds)
   [info] - Run PySpark to test a pyfiles example (14 seconds, 644 milliseconds)
   [info] - Run PySpark with memory customization (12 seconds, 467 milliseconds)
   [info] - Run in client mode. (9 seconds, 182 milliseconds)
   [info] - Start pod creation from template (11 seconds, 619 milliseconds)
   [info] - Test basic decommissioning (45 seconds, 783 milliseconds)
   [info] - Test basic decommissioning with shuffle cleanup (46 seconds, 198 milliseconds)
   [info] - Test decommissioning with dynamic allocation & shuffle cleanups (2 minutes, 45 seconds)
   [info] - Test decommissioning timeouts (45 seconds, 242 milliseconds)
   [info] - SPARK-37576: Rolling decommissioning (1 minute, 7 seconds)
   [info] VolcanoSuite:
   [info] - Run SparkPi with no resources (12 seconds, 548 milliseconds)
   [info] - Run SparkPi with no resources & statefulset allocation (11 seconds, 638 milliseconds)
   [info] - Run SparkPi with a very long application name. (11 seconds, 565 milliseconds)
   [info] - Use SparkLauncher.NO_RESOURCE (11 seconds, 492 milliseconds)
   [info] - Run SparkPi with a master URL without a scheme. (11 seconds, 465 milliseconds)
   [info] - Run SparkPi with an argument. (11 seconds, 617 milliseconds)
   [info] - Run SparkPi with custom labels, annotations, and environment variables. (12 seconds, 538 milliseconds)
   [info] - All pods have the same service account by default (11 seconds, 476 milliseconds)
   [info] - Run extraJVMOptions check on driver (6 seconds, 457 milliseconds)
   [info] - Run SparkRemoteFileTest using a remote data file (12 seconds, 536 milliseconds)
   [info] - Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j2.properties (18 seconds, 953 milliseconds)
   [info] - Run SparkPi with env and mount secrets. (20 seconds, 923 milliseconds)
   [info] - Run PySpark on simple pi.py example (13 seconds, 594 milliseconds)
   [info] - Run PySpark to test a pyfiles example (16 seconds, 584 milliseconds)
   [info] - Run PySpark with memory customization (12 seconds, 466 milliseconds)
   [info] - Run in client mode. (9 seconds, 125 milliseconds)
   [info] - Start pod creation from template (11 seconds, 642 milliseconds)
   [info] - Test basic decommissioning (46 seconds, 809 milliseconds)
   [info] - Test basic decommissioning with shuffle cleanup (48 seconds, 186 milliseconds)
   [info] - Test decommissioning with dynamic allocation & shuffle cleanups (2 minutes, 47 seconds)
   [info] - Test decommissioning timeouts (47 seconds, 200 milliseconds)
   [info] - SPARK-37576: Rolling decommissioning (1 minute, 8 seconds)
   [info] - Run SparkPi with volcano scheduler (11 seconds, 693 milliseconds)
   [info] - Run 4 SparkPi Jobs with 2 volcano queues (queue scheduling) (30 seconds, 590 milliseconds)
   ```
   
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dcoliversun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dcoliversun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r813770615



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala
##########
@@ -212,7 +212,9 @@ class KubernetesSuite extends SparkFunSuite
       driverPodChecker: Pod => Unit = doBasicDriverPodCheck,
       executorPodChecker: Pod => Unit = doBasicExecutorPodCheck,
       appArgs: Array[String] = Array.empty[String],
-      isJVM: Boolean = true ): Unit = {
+      isJVM: Boolean = true,
+      sparkConfOverride: Option[SparkAppConf] = None,
+      appLocatorOverride: Option[String] = None): Unit = {

Review comment:
       @martin-g Thanks for your sharing. I also agree that `customSparkConf` is a good choice.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r809350482



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -16,16 +16,33 @@
  */
 package org.apache.spark.deploy.k8s.integrationtest
 
+import java.io.{File, FileInputStream}
+import java.util.UUID
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+import scala.concurrent._
+
+import ExecutionContext.Implicits.global
 import io.fabric8.kubernetes.api.model.Pod
+import io.fabric8.kubernetes.client.NamespacedKubernetesClient
 import io.fabric8.volcano.client.VolcanoClient
+import org.scalatest.concurrent.Eventually
 
 import org.apache.spark.SparkFunSuite
+import org.apache.spark.deploy.k8s.Config._
 import org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
-import org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.k8sTestTag
-import org.apache.spark.deploy.k8s.integrationtest.VolcanoSuite.volcanoTag
+import org.apache.spark.internal.config.NETWORK_AUTH_ENABLED
 
 private[spark] trait VolcanoTestsSuite { k8sSuite: KubernetesSuite =>
+  import VolcanoSuite._
   import VolcanoTestsSuite._
+  import KubernetesSuite._
+
+  lazy val volcanoClient: VolcanoClient
+  = kubernetesTestComponents.kubernetesClient.adapt(classOf[VolcanoClient])

Review comment:
       indentation.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dcoliversun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dcoliversun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r812941826



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/VolcanoFeatureStep.scala
##########
@@ -20,6 +20,7 @@ import io.fabric8.kubernetes.api.model._
 import io.fabric8.volcano.scheduling.v1beta1.PodGroupBuilder
 
 import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesDriverConf, KubernetesExecutorConf, SparkPod}
+import org.apache.spark.deploy.k8s.Config._

Review comment:
       How about only import `KUBERNETES_JOB_QUEUE`?

##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala
##########
@@ -212,7 +212,9 @@ class KubernetesSuite extends SparkFunSuite
       driverPodChecker: Pod => Unit = doBasicDriverPodCheck,
       executorPodChecker: Pod => Unit = doBasicExecutorPodCheck,
       appArgs: Array[String] = Array.empty[String],
-      isJVM: Boolean = true ): Unit = {
+      isJVM: Boolean = true,
+      sparkConfOverride: Option[SparkAppConf] = None,
+      appLocatorOverride: Option[String] = None): Unit = {

Review comment:
       Hi, I don't understand what word `override` means. Could you explain it?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r812964505



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala
##########
@@ -212,7 +212,9 @@ class KubernetesSuite extends SparkFunSuite
       driverPodChecker: Pod => Unit = doBasicDriverPodCheck,
       executorPodChecker: Pod => Unit = doBasicExecutorPodCheck,
       appArgs: Array[String] = Array.empty[String],
-      isJVM: Boolean = true ): Unit = {
+      isJVM: Boolean = true,
+      sparkConfOverride: Option[SparkAppConf] = None,
+      appLocatorOverride: Option[String] = None): Unit = {

Review comment:
       The previous integrations test are single job based.
   
   Thus a suite can only submit one spark job, this adds the ability to create serveral jobs in a single suite with user specified conf, if not specified, will back to use the sparkAppConf.
   
   Also search `sparkConfOverride.getOrElse(sparkAppConf)` for understand.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816393633



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -63,8 +150,40 @@ private[spark] trait VolcanoTestsSuite { k8sSuite: KubernetesSuite =>
       }
     )
   }
+
+  test("Run 4 SparkPi Jobs with 2 volcano queues (queue scheduling)", k8sTestTag, volcanoTag) {
+    createOrReplaceYAMLResource(VOLCANO_Q0_DISABLE_Q1_ENABLE_YAML)
+    val jobNum = 4
+    val groupLocator = "queue-test" + UUID.randomUUID().toString.replaceAll("-", "")
+    // Submit two jobs into disabled queue0 and enabled queue1
+    (1 to jobNum).foreach { i =>
+      Future {
+        runJobAndVerify(i.toString, Option(groupLocator), Option(s"queue${i % 2}"))
+      }
+    }
+    // There are two `Succeeded` jobs and two `Pending` jobs
+    Eventually.eventually(TIMEOUT, INTERVAL) {
+      val completedPods = getPods("driver", groupLocator, "Succeeded")
+      assert(completedPods.size === 2)
+      val pendingPods = getPods("driver", groupLocator, "Pending")
+      assert(pendingPods.size === 2)
+    }

Review comment:
       If possible, please two group locators. One for Q0 and the other for Q1.  That would be easier.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35553: [SPARK-38188][K8S] Support `spark.kubernetes.job.queue`

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816453791



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
##########
@@ -292,6 +292,14 @@ private[spark] object Config extends Logging {
       .stringConf
       .createOptional
 
+  val KUBERNETES_JOB_QUEUE = ConfigBuilder("spark.kubernetes.job.queue")
+    .doc("The name of the queue to which the job is submitted. This info " +
+      "will be stored in configuration and passed to specified feature step (such as " +

Review comment:
       I changed this to `specific feature step`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support `spark.kubernetes.job.queue`

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816404103



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -16,16 +16,33 @@
  */
 package org.apache.spark.deploy.k8s.integrationtest
 
+import java.io.{File, FileInputStream}
+import java.util.UUID
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+import scala.concurrent._
+
+import ExecutionContext.Implicits.global
 import io.fabric8.kubernetes.api.model.Pod
+import io.fabric8.kubernetes.client.NamespacedKubernetesClient
 import io.fabric8.volcano.client.VolcanoClient
+import org.scalatest.concurrent.Eventually
 
 import org.apache.spark.SparkFunSuite
+import org.apache.spark.deploy.k8s.Config._
 import org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
-import org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.k8sTestTag
-import org.apache.spark.deploy.k8s.integrationtest.VolcanoSuite.volcanoTag
+import org.apache.spark.internal.config.NETWORK_AUTH_ENABLED
 
 private[spark] trait VolcanoTestsSuite { k8sSuite: KubernetesSuite =>
+  import VolcanoSuite._

Review comment:
       Please remove this and recover `import org.apache.spark.deploy.k8s.integrationtest.VolcanoSuite.volcanoTag`. That's better than this wild-card import.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r809350603



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -16,16 +16,33 @@
  */
 package org.apache.spark.deploy.k8s.integrationtest
 
+import java.io.{File, FileInputStream}
+import java.util.UUID
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+import scala.concurrent._
+
+import ExecutionContext.Implicits.global
 import io.fabric8.kubernetes.api.model.Pod
+import io.fabric8.kubernetes.client.NamespacedKubernetesClient
 import io.fabric8.volcano.client.VolcanoClient
+import org.scalatest.concurrent.Eventually
 
 import org.apache.spark.SparkFunSuite
+import org.apache.spark.deploy.k8s.Config._
 import org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
-import org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.k8sTestTag
-import org.apache.spark.deploy.k8s.integrationtest.VolcanoSuite.volcanoTag
+import org.apache.spark.internal.config.NETWORK_AUTH_ENABLED
 
 private[spark] trait VolcanoTestsSuite { k8sSuite: KubernetesSuite =>
+  import VolcanoSuite._
   import VolcanoTestsSuite._
+  import KubernetesSuite._
+
+  lazy val volcanoClient: VolcanoClient
+  = kubernetesTestComponents.kubernetesClient.adapt(classOf[VolcanoClient])
+  lazy val k8sClient: NamespacedKubernetesClient
+  = kubernetesTestComponents.kubernetesClient

Review comment:
       ditto.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] martin-g commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
martin-g commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r809739896



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala
##########
@@ -212,7 +212,9 @@ class KubernetesSuite extends SparkFunSuite
       driverPodChecker: Pod => Unit = doBasicDriverPodCheck,
       executorPodChecker: Pod => Unit = doBasicExecutorPodCheck,
       appArgs: Array[String] = Array.empty[String],
-      isJVM: Boolean = true ): Unit = {
+      isJVM: Boolean = true,
+      sparkConf: Option[SparkAppConf] = None,
+      appLoc: Option[String] = None): Unit = {

Review comment:
       `appLoc` is not very clear name (at least to me). I had to scroll down to see that it is used for `spark-app-locator`.
   Maybe rename to `appLocatorOverride` ?!
   If you like the idea then maybe the same for `sparkConf` -> `sparkConfOverride`

##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -16,16 +16,33 @@
  */
 package org.apache.spark.deploy.k8s.integrationtest
 
+import java.io.{File, FileInputStream}
+import java.util.UUID
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+import scala.concurrent._
+
+import ExecutionContext.Implicits.global
 import io.fabric8.kubernetes.api.model.Pod
+import io.fabric8.kubernetes.client.NamespacedKubernetesClient
 import io.fabric8.volcano.client.VolcanoClient
+import org.scalatest.concurrent.Eventually
 
 import org.apache.spark.SparkFunSuite
+import org.apache.spark.deploy.k8s.Config._
 import org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
-import org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.k8sTestTag
-import org.apache.spark.deploy.k8s.integrationtest.VolcanoSuite.volcanoTag
+import org.apache.spark.internal.config.NETWORK_AUTH_ENABLED
 
 private[spark] trait VolcanoTestsSuite { k8sSuite: KubernetesSuite =>
+  import VolcanoSuite._
   import VolcanoTestsSuite._
+  import KubernetesSuite._
+
+  lazy val volcanoClient: VolcanoClient
+    = kubernetesTestComponents.kubernetesClient.adapt(classOf[VolcanoClient])

Review comment:
       With https://github.com/apache/spark/pull/35555 I suggested to close the KubernetesClient in `cleanUp()`. Is the same needed for `volcanoClient` ?

##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala
##########
@@ -221,7 +223,10 @@ class KubernetesSuite extends SparkFunSuite
       appArgs,
       driverPodChecker,
       executorPodChecker,
-      isJVM)
+      isJVM,
+      sparkConf = sparkConf,

Review comment:
       nit: no need of the RHS

##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -37,12 +54,81 @@ private[spark] trait VolcanoTestsSuite { k8sSuite: KubernetesSuite =>
     assert(annotations.get("scheduling.k8s.io/group-name") === s"$appId-podgroup")
   }
 
-  protected def checkPodGroup(pod: Pod): Unit = {
+  protected def checkPodGroup(
+      pod: Pod,
+      queue: Option[String] = None): Unit = {
     val appId = pod.getMetadata.getLabels.get("spark-app-selector")
     val podGroupName = s"$appId-podgroup"
-    val volcanoClient = kubernetesTestComponents.kubernetesClient.adapt(classOf[VolcanoClient])
     val podGroup = volcanoClient.podGroups().withName(podGroupName).get()
     assert(podGroup.getMetadata.getOwnerReferences.get(0).getName === pod.getMetadata.getName)
+    val spec = podGroup.getSpec
+    if (queue.isDefined) assert(spec.getQueue === queue.get)
+  }
+
+  private def createOrReplaceYAMLResource(yamlPath: String): Unit = {
+    k8sClient.load(new FileInputStream(yamlPath)).createOrReplace()
+  }
+
+  private def deleteYAMLResource(yamlPath: String): Unit = {
+    k8sClient.load(new FileInputStream(yamlPath)).delete()
+  }
+
+  private def getPods(role: String, groupLoc: String, statusPhase: String): mutable.Buffer[Pod] = {

Review comment:
       s/groupLoc/groupLocator/ ?

##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -37,12 +54,81 @@ private[spark] trait VolcanoTestsSuite { k8sSuite: KubernetesSuite =>
     assert(annotations.get("scheduling.k8s.io/group-name") === s"$appId-podgroup")
   }
 
-  protected def checkPodGroup(pod: Pod): Unit = {
+  protected def checkPodGroup(
+      pod: Pod,
+      queue: Option[String] = None): Unit = {
     val appId = pod.getMetadata.getLabels.get("spark-app-selector")
     val podGroupName = s"$appId-podgroup"
-    val volcanoClient = kubernetesTestComponents.kubernetesClient.adapt(classOf[VolcanoClient])
     val podGroup = volcanoClient.podGroups().withName(podGroupName).get()
     assert(podGroup.getMetadata.getOwnerReferences.get(0).getName === pod.getMetadata.getName)
+    val spec = podGroup.getSpec
+    if (queue.isDefined) assert(spec.getQueue === queue.get)
+  }
+
+  private def createOrReplaceYAMLResource(yamlPath: String): Unit = {
+    k8sClient.load(new FileInputStream(yamlPath)).createOrReplace()
+  }
+
+  private def deleteYAMLResource(yamlPath: String): Unit = {
+    k8sClient.load(new FileInputStream(yamlPath)).delete()
+  }
+
+  private def getPods(role: String, groupLoc: String, statusPhase: String): mutable.Buffer[Pod] = {
+    k8sClient
+      .pods()
+      .withLabel("spark-group-locator", groupLoc)
+      .withLabel("spark-role", role)
+      .withField("status.phase", statusPhase)
+      .list()
+      .getItems.asScala
+  }
+
+  def runJobAndVerify(
+      batchSuffix: String,
+      groupLoc: Option[String] = None,
+      queue: Option[String] = None): Unit = {
+    val appLoc = s"${appLocator}${batchSuffix}"
+    val podName = s"${driverPodName}-${batchSuffix}"
+    // create new configuration for every job
+    val conf = createVolcanoSparkConf(
+      driverPodName = podName, appLoc = appLoc, groupLoc = groupLoc, queue

Review comment:
       No need of the RHSs (`= appLoc`)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35553: [SPARK-38188][K8S] Support `spark.kubernetes.job.queue`

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816453791



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
##########
@@ -292,6 +292,14 @@ private[spark] object Config extends Logging {
       .stringConf
       .createOptional
 
+  val KUBERNETES_JOB_QUEUE = ConfigBuilder("spark.kubernetes.job.queue")
+    .doc("The name of the queue to which the job is submitted. This info " +
+      "will be stored in configuration and passed to specified feature step (such as " +

Review comment:
       Change this to `specific feature step`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r811154925



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala
##########
@@ -221,7 +223,10 @@ class KubernetesSuite extends SparkFunSuite
       appArgs,
       driverPodChecker,
       executorPodChecker,
-      isJVM)
+      isJVM,
+      sparkConf = sparkConf,

Review comment:
       we need this, but no need in blow. :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun closed pull request #35553: [SPARK-38188][K8S] Support `spark.kubernetes.job.queue`

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun closed pull request #35553:
URL: https://github.com/apache/spark/pull/35553


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dcoliversun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dcoliversun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r813479796



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala
##########
@@ -212,7 +212,9 @@ class KubernetesSuite extends SparkFunSuite
       driverPodChecker: Pod => Unit = doBasicDriverPodCheck,
       executorPodChecker: Pod => Unit = doBasicExecutorPodCheck,
       appArgs: Array[String] = Array.empty[String],
-      isJVM: Boolean = true ): Unit = {
+      isJVM: Boolean = true,
+      sparkConfOverride: Option[SparkAppConf] = None,
+      appLocatorOverride: Option[String] = None): Unit = {

Review comment:
       I understand. I don't think it's best to name it as `override`. How about `specifiedSparkConf` ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dcoliversun commented on pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dcoliversun commented on pull request #35553:
URL: https://github.com/apache/spark/pull/35553#issuecomment-1049780401


   LGTM. @Yikun Great Work.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
Yikun commented on pull request #35553:
URL: https://github.com/apache/spark/pull/35553#issuecomment-1054019417


   @holdenk As we dicussion offline before, this also introduced some test framework change, it's also the pre-PR for followup two PRS. It would be good if you could get some time to review this first. : )


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun edited a comment on pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
Yikun edited a comment on pull request #35553:
URL: https://github.com/apache/spark/pull/35553#issuecomment-1043588796


   Thanks for your reminder @dongjoon-hyun 
   
   also ping @holdenk @tgravescs @mridulm @yangwwei @yaooqinn @rdblue @HyukjinKwon 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816392731



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -37,12 +54,82 @@ private[spark] trait VolcanoTestsSuite { k8sSuite: KubernetesSuite =>
     assert(annotations.get("scheduling.k8s.io/group-name") === s"$appId-podgroup")
   }
 
-  protected def checkPodGroup(pod: Pod): Unit = {
+  protected def checkPodGroup(
+      pod: Pod,
+      queue: Option[String] = None): Unit = {
     val appId = pod.getMetadata.getLabels.get("spark-app-selector")
     val podGroupName = s"$appId-podgroup"
-    val volcanoClient = kubernetesTestComponents.kubernetesClient.adapt(classOf[VolcanoClient])
     val podGroup = volcanoClient.podGroups().withName(podGroupName).get()
     assert(podGroup.getMetadata.getOwnerReferences.get(0).getName === pod.getMetadata.getName)
+    val spec = podGroup.getSpec
+    if (queue.isDefined) assert(spec.getQueue === queue.get)
+  }
+
+  private def createOrReplaceYAMLResource(yamlPath: String): Unit = {
+    k8sClient.load(new FileInputStream(yamlPath)).createOrReplace()
+  }
+
+  private def deleteYAMLResource(yamlPath: String): Unit = {
+    k8sClient.load(new FileInputStream(yamlPath)).delete()
+  }
+
+  private def getPods(
+      role: String,
+      groupLocator: String,
+      statusPhase: String): mutable.Buffer[Pod] = {
+    k8sClient
+      .pods()
+      .withLabel("spark-group-locator", groupLocator)
+      .withLabel("spark-role", role)
+      .withField("status.phase", statusPhase)
+      .list()
+      .getItems.asScala
+  }
+
+  def runJobAndVerify(
+      batchSuffix: String,
+      groupLoc: Option[String] = None,
+      queue: Option[String] = None): Unit = {
+    val appLoc = s"${appLocator}${batchSuffix}"
+    val podName = s"${driverPodName}-${batchSuffix}"
+    // create new configuration for every job
+    val conf = createVolcanoSparkConf(podName, appLoc, groupLoc, queue)
+    runSparkPiAndVerifyCompletion(
+      driverPodChecker = (driverPod: Pod) => {
+        checkScheduler(driverPod)
+        checkAnnotaion(driverPod)
+        checkPodGroup(driverPod, queue)
+      },
+      executorPodChecker = (executorPod: Pod) => {
+        checkScheduler(executorPod)
+        checkAnnotaion(executorPod)
+      },
+      customSparkConf = Option(conf),
+      customAppLocator = Option(appLoc)
+    )
+  }
+
+  private def createVolcanoSparkConf(
+      driverPodName: String = driverPodName,
+      appLoc: String = appLocator,
+      groupLoc: Option[String] = None,
+      queue: Option[String] = None): SparkAppConf = {
+    val conf = kubernetesTestComponents.newSparkAppConf()
+      .set(CONTAINER_IMAGE.key, image)
+      .set(KUBERNETES_DRIVER_POD_NAME.key, driverPodName)
+      .set(s"${KUBERNETES_DRIVER_LABEL_PREFIX}spark-app-locator", appLoc)
+      .set(s"${KUBERNETES_EXECUTOR_LABEL_PREFIX}spark-app-locator", appLoc)
+      .set(NETWORK_AUTH_ENABLED.key, "true")
+      // below is volcano specific configuration
+      .set(KUBERNETES_SCHEDULER_NAME.key, "volcano")
+      .set(KUBERNETES_DRIVER_POD_FEATURE_STEPS.key, VOLCANO_FEATURE_STEP)
+      .set(KUBERNETES_EXECUTOR_POD_FEATURE_STEPS.key, VOLCANO_FEATURE_STEP)
+    if (queue.isDefined) conf.set(KUBERNETES_JOB_QUEUE.key, queue.get)

Review comment:
       We prefer the following style, @Yikun . 
   ```scala
   -    if (queue.isDefined) conf.set(KUBERNETES_JOB_QUEUE.key, queue.get)
   +    queue.foreach(conf.set(KUBERNETES_JOB_QUEUE.key, _))
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816401646



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
##########
@@ -292,6 +292,14 @@ private[spark] object Config extends Logging {
       .stringConf
       .createOptional
 
+  val KUBERNETES_JOB_QUEUE = ConfigBuilder("spark.kubernetes.job.queue")
+    .doc("The name of the queue to which the job is submitted. This info " +
+      "will be stored in configuration and passed to specified feature step (such as " +

Review comment:
       `specified feature step` is ambiguous if this is handed over to all feature steps.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] martin-g commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
martin-g commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r813669117



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala
##########
@@ -212,7 +212,9 @@ class KubernetesSuite extends SparkFunSuite
       driverPodChecker: Pod => Unit = doBasicDriverPodCheck,
       executorPodChecker: Pod => Unit = doBasicExecutorPodCheck,
       appArgs: Array[String] = Array.empty[String],
-      isJVM: Boolean = true ): Unit = {
+      isJVM: Boolean = true,
+      sparkConfOverride: Option[SparkAppConf] = None,
+      appLocatorOverride: Option[String] = None): Unit = {

Review comment:
       I was the one who suggested `sparkConfOverride`. I agree that it might be confusing. Here is another alternative - `customSparkConf` ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35553: [SPARK-38188][K8S] Support `spark.kubernetes.job.queue`

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816451073



##########
File path: resource-managers/kubernetes/integration-tests/src/test/resources/volcano/disable-queue0-enable-queue1.yml
##########
@@ -0,0 +1,29 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+apiVersion: scheduling.volcano.sh/v1beta1

Review comment:
       At present, there is no certain timeline for apiVersion, but it is in roadmap, but at least there will be no change before Spark 3.3 release. also cc @william-wang




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun edited a comment on pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
Yikun edited a comment on pull request #35553:
URL: https://github.com/apache/spark/pull/35553#issuecomment-1043588796


   Thanks for your reminder @dongjoon-hyun 
   
   also ping @holdenk @tgravescs @mridulm 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816401757



##########
File path: docs/running-on-kubernetes.md
##########
@@ -1356,6 +1356,15 @@ See the [configuration page](configuration.html) for information on Spark config
   </td>
   <td>3.3.0</td>
 </tr>
+<tr>
+  <td><code>spark.kubernetes.job.queue</code></td>
+  <td><code>(none)</code></td>
+  <td>
+    The name of the queue to which the job is submitted. This info will be stored in configuration
+    and passed to specified feature step (such as `VolcanoFeatureStep`).

Review comment:
       Please remove
   ```
    (such as `VolcanoFeatureStep`)
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816402623



##########
File path: docs/running-on-kubernetes.md
##########
@@ -1356,6 +1356,15 @@ See the [configuration page](configuration.html) for information on Spark config
   </td>
   <td>3.3.0</td>
 </tr>
+<tr>
+  <td><code>spark.kubernetes.job.queue</code></td>
+  <td><code>(none)</code></td>

Review comment:
       Shall we follow the other `(none)`s in this document? It seems that we don't codify `(none)` here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r809349027



##########
File path: docs/running-on-kubernetes.md
##########
@@ -1356,6 +1356,15 @@ See the [configuration page](configuration.html) for information on Spark config
   </td>
   <td>3.3.0</td>
 </tr>
+<tr>
+  <td><code>spark.kubernetes.job.queue</code></td>
+  <td><code>default</code></td>

Review comment:
       Well, I guess this should be empty.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r812964505



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala
##########
@@ -212,7 +212,9 @@ class KubernetesSuite extends SparkFunSuite
       driverPodChecker: Pod => Unit = doBasicDriverPodCheck,
       executorPodChecker: Pod => Unit = doBasicExecutorPodCheck,
       appArgs: Array[String] = Array.empty[String],
-      isJVM: Boolean = true ): Unit = {
+      isJVM: Boolean = true,
+      sparkConfOverride: Option[SparkAppConf] = None,
+      appLocatorOverride: Option[String] = None): Unit = {

Review comment:
       The previous integrations test are single job based.
   
   Thus a suite can only submit one spark job, this adds the ability to create serveral jobs  with user specified conf in a single suite, if not specified, will back to use the `sparkAppConf`.
   
   Also search `sparkConfOverride.getOrElse(sparkAppConf)` for understand.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r813779244



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala
##########
@@ -212,7 +212,9 @@ class KubernetesSuite extends SparkFunSuite
       driverPodChecker: Pod => Unit = doBasicDriverPodCheck,
       executorPodChecker: Pod => Unit = doBasicExecutorPodCheck,
       appArgs: Array[String] = Array.empty[String],
-      isJVM: Boolean = true ): Unit = {
+      isJVM: Boolean = true,
+      sparkConfOverride: Option[SparkAppConf] = None,
+      appLocatorOverride: Option[String] = None): Unit = {

Review comment:
       Thanks, done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dcoliversun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dcoliversun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r813471699



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/VolcanoFeatureStep.scala
##########
@@ -20,6 +20,7 @@ import io.fabric8.kubernetes.api.model._
 import io.fabric8.volcano.scheduling.v1beta1.PodGroupBuilder
 
 import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesDriverConf, KubernetesExecutorConf, SparkPod}
+import org.apache.spark.deploy.k8s.Config._

Review comment:
       OK for me :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816387170



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -16,16 +16,33 @@
  */
 package org.apache.spark.deploy.k8s.integrationtest
 
+import java.io.{File, FileInputStream}
+import java.util.UUID
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+import scala.concurrent._
+
+import ExecutionContext.Implicits.global
 import io.fabric8.kubernetes.api.model.Pod
+import io.fabric8.kubernetes.client.NamespacedKubernetesClient
 import io.fabric8.volcano.client.VolcanoClient
+import org.scalatest.concurrent.Eventually
 
 import org.apache.spark.SparkFunSuite
+import org.apache.spark.deploy.k8s.Config._
 import org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
-import org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.k8sTestTag
-import org.apache.spark.deploy.k8s.integrationtest.VolcanoSuite.volcanoTag
+import org.apache.spark.internal.config.NETWORK_AUTH_ENABLED
 
 private[spark] trait VolcanoTestsSuite { k8sSuite: KubernetesSuite =>
+  import VolcanoSuite._
   import VolcanoTestsSuite._
+  import KubernetesSuite._
+
+  lazy val volcanoClient: VolcanoClient
+    = kubernetesTestComponents.kubernetesClient.adapt(classOf[VolcanoClient])
+  lazy val k8sClient: NamespacedKubernetesClient
+    = kubernetesTestComponents.kubernetesClient

Review comment:
       One line please.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35553: [SPARK-38188][K8S] Support `spark.kubernetes.job.queue`

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816452421



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
##########
@@ -292,6 +292,14 @@ private[spark] object Config extends Logging {
       .stringConf
       .createOptional
 
+  val KUBERNETES_JOB_QUEUE = ConfigBuilder("spark.kubernetes.job.queue")

Review comment:
       I can get your concern in here, I think it's a good note. : )
   
   The short plan is we need to also add `minCPU`/`priorityClassName` in here, and there are also some `job` related introduce in future.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816384203



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -63,8 +150,40 @@ private[spark] trait VolcanoTestsSuite { k8sSuite: KubernetesSuite =>
       }
     )
   }
+
+  test("Run 4 SparkPi Jobs with 2 volcano queues (queue scheduling)", k8sTestTag, volcanoTag) {

Review comment:
       Although this is not mandatory for this case, could you add `SPARK-38188:`? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r812964505



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala
##########
@@ -212,7 +212,9 @@ class KubernetesSuite extends SparkFunSuite
       driverPodChecker: Pod => Unit = doBasicDriverPodCheck,
       executorPodChecker: Pod => Unit = doBasicExecutorPodCheck,
       appArgs: Array[String] = Array.empty[String],
-      isJVM: Boolean = true ): Unit = {
+      isJVM: Boolean = true,
+      sparkConfOverride: Option[SparkAppConf] = None,
+      appLocatorOverride: Option[String] = None): Unit = {

Review comment:
       The previous integrations test are single job based.
   
   Thus a suite can only submit one spark job, this adds the ability to create serveral jobs in a single suite with user specified conf, if not specified, will back to use the `sparkAppConf`.
   
   Also search `sparkConfOverride.getOrElse(sparkAppConf)` for understand.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r811204434



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -16,16 +16,33 @@
  */
 package org.apache.spark.deploy.k8s.integrationtest
 
+import java.io.{File, FileInputStream}
+import java.util.UUID
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+import scala.concurrent._
+
+import ExecutionContext.Implicits.global
 import io.fabric8.kubernetes.api.model.Pod
+import io.fabric8.kubernetes.client.NamespacedKubernetesClient
 import io.fabric8.volcano.client.VolcanoClient
+import org.scalatest.concurrent.Eventually
 
 import org.apache.spark.SparkFunSuite
+import org.apache.spark.deploy.k8s.Config._
 import org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
-import org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.k8sTestTag
-import org.apache.spark.deploy.k8s.integrationtest.VolcanoSuite.volcanoTag
+import org.apache.spark.internal.config.NETWORK_AUTH_ENABLED
 
 private[spark] trait VolcanoTestsSuite { k8sSuite: KubernetesSuite =>
+  import VolcanoSuite._
   import VolcanoTestsSuite._
+  import KubernetesSuite._
+
+  lazy val volcanoClient: VolcanoClient
+    = kubernetesTestComponents.kubernetesClient.adapt(classOf[VolcanoClient])

Review comment:
       I think no need, because volcano share the resource of `kubernetesTestComponents.kubernetesClient`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816389589



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -63,8 +150,40 @@ private[spark] trait VolcanoTestsSuite { k8sSuite: KubernetesSuite =>
       }
     )
   }
+
+  test("Run 4 SparkPi Jobs with 2 volcano queues (queue scheduling)", k8sTestTag, volcanoTag) {
+    createOrReplaceYAMLResource(VOLCANO_Q0_DISABLE_Q1_ENABLE_YAML)
+    val jobNum = 4
+    val groupLocator = "queue-test" + UUID.randomUUID().toString.replaceAll("-", "")
+    // Submit two jobs into disabled queue0 and enabled queue1
+    (1 to jobNum).foreach { i =>
+      Future {
+        runJobAndVerify(i.toString, Option(groupLocator), Option(s"queue${i % 2}"))
+      }
+    }
+    // There are two `Succeeded` jobs and two `Pending` jobs
+    Eventually.eventually(TIMEOUT, INTERVAL) {
+      val completedPods = getPods("driver", groupLocator, "Succeeded")
+      assert(completedPods.size === 2)
+      val pendingPods = getPods("driver", groupLocator, "Pending")
+      assert(pendingPods.size === 2)
+    }

Review comment:
       Actually, this is weak because this test will pass in a normal sequential scheduler too. We had better check the content of driver explicitly.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816383473



##########
File path: resource-managers/kubernetes/integration-tests/src/test/resources/volcano/disable-queue0-enable-queue1.yml
##########
@@ -0,0 +1,29 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+apiVersion: scheduling.volcano.sh/v1beta1

Review comment:
       Just a question. What is the plan for removal of this `beta` in Volcano roadmap?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support `spark.kubernetes.job.queue`

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816404547



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -16,16 +16,33 @@
  */
 package org.apache.spark.deploy.k8s.integrationtest
 
+import java.io.{File, FileInputStream}
+import java.util.UUID
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+import scala.concurrent._
+
+import ExecutionContext.Implicits.global
 import io.fabric8.kubernetes.api.model.Pod
+import io.fabric8.kubernetes.client.NamespacedKubernetesClient
 import io.fabric8.volcano.client.VolcanoClient
+import org.scalatest.concurrent.Eventually
 
 import org.apache.spark.SparkFunSuite
+import org.apache.spark.deploy.k8s.Config._
 import org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
-import org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.k8sTestTag
-import org.apache.spark.deploy.k8s.integrationtest.VolcanoSuite.volcanoTag
+import org.apache.spark.internal.config.NETWORK_AUTH_ENABLED
 
 private[spark] trait VolcanoTestsSuite { k8sSuite: KubernetesSuite =>
+  import VolcanoSuite._
   import VolcanoTestsSuite._
+  import KubernetesSuite._

Review comment:
       Remove this and use 
   ```
   import org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.{k8sTestTag, INTERVAL, TIMEOUT}
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support `spark.kubernetes.job.queue`

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816399653



##########
File path: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/VolcanoTestsSuite.scala
##########
@@ -16,16 +16,33 @@
  */
 package org.apache.spark.deploy.k8s.integrationtest
 
+import java.io.{File, FileInputStream}
+import java.util.UUID
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+import scala.concurrent._
+
+import ExecutionContext.Implicits.global

Review comment:
       Actually, SPARK-37891 discourage this. If you need to use this please wrap this with `// scalastyle:off executioncontextglobal` and on.
   ```scala
   // scalastyle:off executioncontextglobal
   import scala.concurrent.ExecutionContext.Implicits.global
   // scalastyle:on executioncontextglobal
   ```
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816401946



##########
File path: docs/running-on-kubernetes.md
##########
@@ -1356,6 +1356,15 @@ See the [configuration page](configuration.html) for information on Spark config
   </td>
   <td>3.3.0</td>
 </tr>
+<tr>
+  <td><code>spark.kubernetes.job.queue</code></td>
+  <td><code>(none)</code></td>
+  <td>
+    The name of the queue to which the job is submitted. This info will be stored in configuration
+    and passed to specified feature step (such as `VolcanoFeatureStep`).

Review comment:
       Also, `specified feature step` is ambiguous if this configuration is progagated to all feature steps.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support `spark.kubernetes.job.queue`

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r816403150



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
##########
@@ -292,6 +292,14 @@ private[spark] object Config extends Logging {
       .stringConf
       .createOptional
 
+  val KUBERNETES_JOB_QUEUE = ConfigBuilder("spark.kubernetes.job.queue")

Review comment:
       As you know, if there is no other `job` config, this should be `jobQueue` instead of `job.queue`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
Yikun commented on pull request #35553:
URL: https://github.com/apache/spark/pull/35553#issuecomment-1043115602


   cc @dongjoon-hyun @martin-g @william-wang


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35553:
URL: https://github.com/apache/spark/pull/35553#discussion_r809349339



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
##########
@@ -292,6 +292,14 @@ private[spark] object Config extends Logging {
       .stringConf
       .createOptional
 
+  val KUBERNETES_JOB_QUEUE = ConfigBuilder("spark.kubernetes.job.queue")
+    .doc("The name of the queue to which the job is submitted. This info " +
+      "will be stored in configuration and passed to specified feature step (such as " +
+      "`VolcanoFeatureStep`).")
+    .version("3.3.0")
+    .stringConf
+    .createWithDefault("default")

Review comment:
       `createdOptional`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on pull request #35553: [SPARK-38188][K8S] Support queue scheduling with Volcano implementations

Posted by GitBox <gi...@apache.org>.
Yikun commented on pull request #35553:
URL: https://github.com/apache/spark/pull/35553#issuecomment-1042976528


   Kubernetes K8S integration test passed:
   ```
   $ build/sbt -Pvolcano -Pkubernetes -Pkubernetes-integration-tests -Dtest.exclude.tags=minikube,r -Dspark.kubernetes.test.namespace=default "kubernetes-integration-tests/test"
   
   [info] KubernetesSuite:
   [info] - Run SparkPi with no resources (10 seconds, 946 milliseconds)
   [info] - Run SparkPi with no resources & statefulset allocation (10 seconds, 714 milliseconds)
   [info] - Run SparkPi with a very long application name. (11 seconds, 494 milliseconds)
   [info] - Use SparkLauncher.NO_RESOURCE (10 seconds, 544 milliseconds)
   [info] - Run SparkPi with a master URL without a scheme. (11 seconds, 588 milliseconds)
   [info] - Run SparkPi with an argument. (11 seconds, 619 milliseconds)
   [info] - Run SparkPi with custom labels, annotations, and environment variables. (10 seconds, 555 milliseconds)
   [info] - All pods have the same service account by default (11 seconds, 490 milliseconds)
   [info] - Run extraJVMOptions check on driver (5 seconds, 467 milliseconds)
   [info] - Run SparkRemoteFileTest using a remote data file (10 seconds, 745 milliseconds)
   [info] - Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j2.properties (18 seconds, 999 milliseconds)
   [info] - Run SparkPi with env and mount secrets. (20 seconds, 363 milliseconds)
   [info] - Run PySpark on simple pi.py example (12 seconds, 653 milliseconds)
   [info] - Run PySpark to test a pyfiles example (14 seconds, 644 milliseconds)
   [info] - Run PySpark with memory customization (12 seconds, 467 milliseconds)
   [info] - Run in client mode. (9 seconds, 182 milliseconds)
   [info] - Start pod creation from template (11 seconds, 619 milliseconds)
   [info] - Test basic decommissioning (45 seconds, 783 milliseconds)
   [info] - Test basic decommissioning with shuffle cleanup (46 seconds, 198 milliseconds)
   [info] - Test decommissioning with dynamic allocation & shuffle cleanups (2 minutes, 45 seconds)
   [info] - Test decommissioning timeouts (45 seconds, 242 milliseconds)
   [info] - SPARK-37576: Rolling decommissioning (1 minute, 7 seconds)
   [info] VolcanoSuite:
   [info] - Run SparkPi with no resources (12 seconds, 548 milliseconds)
   [info] - Run SparkPi with no resources & statefulset allocation (11 seconds, 638 milliseconds)
   [info] - Run SparkPi with a very long application name. (11 seconds, 565 milliseconds)
   [info] - Use SparkLauncher.NO_RESOURCE (11 seconds, 492 milliseconds)
   [info] - Run SparkPi with a master URL without a scheme. (11 seconds, 465 milliseconds)
   [info] - Run SparkPi with an argument. (11 seconds, 617 milliseconds)
   [info] - Run SparkPi with custom labels, annotations, and environment variables. (12 seconds, 538 milliseconds)
   [info] - All pods have the same service account by default (11 seconds, 476 milliseconds)
   [info] - Run extraJVMOptions check on driver (6 seconds, 457 milliseconds)
   [info] - Run SparkRemoteFileTest using a remote data file (12 seconds, 536 milliseconds)
   [info] - Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j2.properties (18 seconds, 953 milliseconds)
   [info] - Run SparkPi with env and mount secrets. (20 seconds, 923 milliseconds)
   [info] - Run PySpark on simple pi.py example (13 seconds, 594 milliseconds)
   [info] - Run PySpark to test a pyfiles example (16 seconds, 584 milliseconds)
   [info] - Run PySpark with memory customization (12 seconds, 466 milliseconds)
   [info] - Run in client mode. (9 seconds, 125 milliseconds)
   [info] - Start pod creation from template (11 seconds, 642 milliseconds)
   [info] - Test basic decommissioning (46 seconds, 809 milliseconds)
   [info] - Test basic decommissioning with shuffle cleanup (48 seconds, 186 milliseconds)
   [info] - Test decommissioning with dynamic allocation & shuffle cleanups (2 minutes, 47 seconds)
   [info] - Test decommissioning timeouts (47 seconds, 200 milliseconds)
   [info] - SPARK-37576: Rolling decommissioning (1 minute, 8 seconds)
   [info] - Run SparkPi with volcano scheduler (11 seconds, 693 milliseconds)
   [info] - Run 4 SparkPi Jobs with 2 volcano queues (queue scheduling) (30 seconds, 590 milliseconds)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org