Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/03/16 12:16:18 UTC

[GitHub] [spark] martin-g commented on a change in pull request #35870: [SPARK-38562][K8S][DOCS] Add doc for `Volcano` scheduler

martin-g commented on a change in pull request #35870:
URL: https://github.com/apache/spark/pull/35870#discussion_r827913886



##########
File path: docs/running-on-kubernetes.md
##########
@@ -1722,6 +1722,83 @@ spec:
     image: will-be-overwritten
 ```
 
+#### Using Volcano as Customized Scheduler for Spark on Kubernetes
+
+##### Prerequisites
+* Volcano supports Spark on Kubernetes since v1.5. Mini version: v1.5.1+. See also [Volcano installation](https://volcano.sh/en/docs/installation).

Review comment:
       ```suggestion
   * Spark on Kubernetes with Volcano as a scheduler is supported since Spark v3.3.0 and Volcano v1.5.1. See also [Volcano installation](https://volcano.sh/en/docs/installation).
   ```

##########
File path: docs/running-on-kubernetes.md
##########
@@ -1722,6 +1722,83 @@ spec:
     image: will-be-overwritten
 ```
 
+#### Using Volcano as Customized Scheduler for Spark on Kubernetes
+
+##### Prerequisites
+* Volcano supports Spark on Kubernetes since v1.5. Mini version: v1.5.1+. See also [Volcano installation](https://volcano.sh/en/docs/installation).
+
+##### Usage
+Spark on Kubernetes allows using Volcano as a customized scheduler. Users can use Volcano to

Review comment:
       ```suggestion
   Spark on Kubernetes allows using Volcano as a custom scheduler. Users can use Volcano to
   ```

##########
File path: docs/running-on-kubernetes.md
##########
@@ -1722,6 +1722,83 @@ spec:
     image: will-be-overwritten
 ```
 
+#### Using Volcano as Customized Scheduler for Spark on Kubernetes
+
+##### Prerequisites
+* Volcano supports Spark on Kubernetes since v1.5. Mini version: v1.5.1+. See also [Volcano installation](https://volcano.sh/en/docs/installation).
+
+##### Usage
+Spark on Kubernetes allows using Volcano as a customized scheduler. Users can use Volcano to
+support more advanced resource scheduling: queue scheduling, resource reservation, priority scheduling, for example:
+
+```
+# Specify volcano scheduler
+--conf spark.kubernetes.scheduler.name=volcano
+# Specify driver/executor VolcanoFeatureStep
+--conf spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
+--conf spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
+# Specify PodGroup template
+--conf spark.kubernetes.scheduler.volcano.podGroupTemplateFile=/path/to/podgroup-template.yaml
+```
+
+##### Volcano Feature Step
+Volcano feature steps help users to create Volcano PodGroup and set driver/executor pod annotation to link this PodGroup.
+
+Note that, currently only supported driver/job level PodGroup in Volcano Feature Step, executor separate PodGroup is not supported yet.
+
+##### Volcano PodGroup Template
+Volcano defines PodGroup spec using [CRD yaml](https://volcano.sh/en/docs/podgroup/#example)
+
+Similar to [Pod template](#pod-template), Spark users can similarly use Volcano PodGroup Template to define the PodGroup spec configurations.
+
+To do so, specify the spark properties `spark.kubernetes.scheduler.volcano.podGroupTemplateFile` to point to files accessible to the `spark-submit` process.

Review comment:
       ```suggestion
   To do so, specify the Spark property `spark.kubernetes.scheduler.volcano.podGroupTemplateFile` to point to a file accessible to the `spark-submit` process.
   ```

##########
File path: docs/running-on-kubernetes.md
##########
@@ -1722,6 +1722,83 @@ spec:
     image: will-be-overwritten
 ```
 
+#### Using Volcano as Customized Scheduler for Spark on Kubernetes
+
+##### Prerequisites
+* Volcano supports Spark on Kubernetes since v1.5. Mini version: v1.5.1+. See also [Volcano installation](https://volcano.sh/en/docs/installation).
+
+##### Usage
+Spark on Kubernetes allows using Volcano as a customized scheduler. Users can use Volcano to
+support more advanced resource scheduling: queue scheduling, resource reservation, priority scheduling, for example:

Review comment:
       ```suggestion
   support more advanced resource scheduling: queue scheduling, resource reservation, priority scheduling, and more.
   ```

##########
File path: docs/running-on-kubernetes.md
##########
@@ -1722,6 +1722,83 @@ spec:
     image: will-be-overwritten
 ```
 
+#### Using Volcano as Customized Scheduler for Spark on Kubernetes
+
+##### Prerequisites
+* Volcano supports Spark on Kubernetes since v1.5. Mini version: v1.5.1+. See also [Volcano installation](https://volcano.sh/en/docs/installation).
+
+##### Usage
+Spark on Kubernetes allows using Volcano as a customized scheduler. Users can use Volcano to
+support more advanced resource scheduling: queue scheduling, resource reservation, priority scheduling, for example:
+
+```
+# Specify volcano scheduler
+--conf spark.kubernetes.scheduler.name=volcano
+# Specify driver/executor VolcanoFeatureStep
+--conf spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
+--conf spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
+# Specify PodGroup template
+--conf spark.kubernetes.scheduler.volcano.podGroupTemplateFile=/path/to/podgroup-template.yaml
+```
+
+##### Volcano Feature Step
+Volcano feature steps help users to create Volcano PodGroup and set driver/executor pod annotation to link this PodGroup.
+
+Note that, currently only supported driver/job level PodGroup in Volcano Feature Step, executor separate PodGroup is not supported yet.
+
+##### Volcano PodGroup Template
+Volcano defines PodGroup spec using [CRD yaml](https://volcano.sh/en/docs/podgroup/#example)
+
+Similar to [Pod template](#pod-template), Spark users can similarly use Volcano PodGroup Template to define the PodGroup spec configurations.
+
+To do so, specify the spark properties `spark.kubernetes.scheduler.volcano.podGroupTemplateFile` to point to files accessible to the `spark-submit` process.
+
+Below is an example of PodGroup template, see also [PodGroup Introduction](https://volcano.sh/en/docs/podgroup/#introduction):
+
+```
+apiVersion: scheduling.volcano.sh/v1beta1
+kind: PodGroup
+spec:
+  # Specify minMember to 1 to make driver
+  minMember: 1
+  # Specify minResources to support resource reservation
+  minResources:
+    cpu: "2"
+    memory: "3Gi"
+  # Specify the priority
+  priorityClassName: high-priority
+  queue: default
+```
+
+##### Features
+<table class="table">
+<tr><th>Scheduling</th><th>Description</th><th>Configuration</th></tr>
+<tr>
+  <td>Queue scheduling</td>
+  <td>
+    Queue indicates the resource queue, which adopts FIFO. is also used as the basis for resource division.

Review comment:
       Maybe also add a link to the Volcano documentation about Queue?

##########
File path: docs/running-on-kubernetes.md
##########
@@ -1722,6 +1722,83 @@ spec:
     image: will-be-overwritten
 ```
 
+#### Using Volcano as Customized Scheduler for Spark on Kubernetes
+
+##### Prerequisites
+* Volcano supports Spark on Kubernetes since v1.5. Mini version: v1.5.1+. See also [Volcano installation](https://volcano.sh/en/docs/installation).
+
+##### Usage
+Spark on Kubernetes allows using Volcano as a customized scheduler. Users can use Volcano to
+support more advanced resource scheduling: queue scheduling, resource reservation, priority scheduling, for example:
+
+```
+# Specify volcano scheduler
+--conf spark.kubernetes.scheduler.name=volcano
+# Specify driver/executor VolcanoFeatureStep
+--conf spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
+--conf spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
+# Specify PodGroup template
+--conf spark.kubernetes.scheduler.volcano.podGroupTemplateFile=/path/to/podgroup-template.yaml
+```
+
+##### Volcano Feature Step
+Volcano feature steps help users to create Volcano PodGroup and set driver/executor pod annotation to link this PodGroup.

Review comment:
       ```suggestion
   Volcano feature steps help users to create a Volcano PodGroup and set driver/executor pod annotation to link with this PodGroup.
   ```

##########
File path: docs/running-on-kubernetes.md
##########
@@ -1722,6 +1722,83 @@ spec:
     image: will-be-overwritten
 ```
 
+#### Using Volcano as Customized Scheduler for Spark on Kubernetes
+
+##### Prerequisites
+* Volcano supports Spark on Kubernetes since v1.5. Mini version: v1.5.1+. See also [Volcano installation](https://volcano.sh/en/docs/installation).
+
+##### Usage
+Spark on Kubernetes allows using Volcano as a customized scheduler. Users can use Volcano to
+support more advanced resource scheduling: queue scheduling, resource reservation, priority scheduling, for example:
+
+```
+# Specify volcano scheduler
+--conf spark.kubernetes.scheduler.name=volcano
+# Specify driver/executor VolcanoFeatureStep
+--conf spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
+--conf spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
+# Specify PodGroup template
+--conf spark.kubernetes.scheduler.volcano.podGroupTemplateFile=/path/to/podgroup-template.yaml
+```
+
+##### Volcano Feature Step
+Volcano feature steps help users to create Volcano PodGroup and set driver/executor pod annotation to link this PodGroup.
+
+Note that, currently only supported driver/job level PodGroup in Volcano Feature Step, executor separate PodGroup is not supported yet.

Review comment:
       ```suggestion
   Note that currently only driver/job level PodGroup is supported in Volcano Feature Step. Executor PodGroup is not supported yet.
   ```

##########
File path: docs/running-on-kubernetes.md
##########
@@ -1722,6 +1722,83 @@ spec:
     image: will-be-overwritten
 ```
 
+#### Using Volcano as Customized Scheduler for Spark on Kubernetes
+
+##### Prerequisites
+* Volcano supports Spark on Kubernetes since v1.5. Mini version: v1.5.1+. See also [Volcano installation](https://volcano.sh/en/docs/installation).
+
+##### Usage
+Spark on Kubernetes allows using Volcano as a customized scheduler. Users can use Volcano to
+support more advanced resource scheduling: queue scheduling, resource reservation, priority scheduling, for example:
+
+```
+# Specify volcano scheduler
+--conf spark.kubernetes.scheduler.name=volcano
+# Specify driver/executor VolcanoFeatureStep
+--conf spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
+--conf spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
+# Specify PodGroup template
+--conf spark.kubernetes.scheduler.volcano.podGroupTemplateFile=/path/to/podgroup-template.yaml
+```
+
+##### Volcano Feature Step
+Volcano feature steps help users to create Volcano PodGroup and set driver/executor pod annotation to link this PodGroup.
+
+Note that, currently only supported driver/job level PodGroup in Volcano Feature Step, executor separate PodGroup is not supported yet.
+
+##### Volcano PodGroup Template
+Volcano defines PodGroup spec using [CRD yaml](https://volcano.sh/en/docs/podgroup/#example)
+
+Similar to [Pod template](#pod-template), Spark users can similarly use Volcano PodGroup Template to define the PodGroup spec configurations.
+
+To do so, specify the spark properties `spark.kubernetes.scheduler.volcano.podGroupTemplateFile` to point to files accessible to the `spark-submit` process.
+
+Below is an example of PodGroup template, see also [PodGroup Introduction](https://volcano.sh/en/docs/podgroup/#introduction):
+
+```
+apiVersion: scheduling.volcano.sh/v1beta1
+kind: PodGroup
+spec:
+  # Specify minMember to 1 to make driver
+  minMember: 1
+  # Specify minResources to support resource reservation
+  minResources:
+    cpu: "2"
+    memory: "3Gi"
+  # Specify the priority
+  priorityClassName: high-priority
+  queue: default
+```
+
+##### Features
+<table class="table">
+<tr><th>Scheduling</th><th>Description</th><th>Configuration</th></tr>
+<tr>
+  <td>Queue scheduling</td>
+  <td>
+    Queue indicates the resource queue, which adopts FIFO. is also used as the basis for resource division.
+    help users specify which queue the job to submit.
+  </td>
+  <td>`spec.queue` field in PodGroup template</td>
+</tr>
+<tr>
+  <td>Resource reservation</td>
+  <td>
+    Resource reservation, aka `Gang` scheduling (start all or nothing), helps users reserve resources for specific jobs.
+    It's useful for ensuring resource are meet the minimum requirements of spark job and avoiding all drivers stuck

Review comment:
       ```suggestion
       It's useful for ensuring resources meet the minimum requirements of the Spark job and avoiding all drivers stuck
   ```

##########
File path: docs/running-on-kubernetes.md
##########
@@ -1722,6 +1722,83 @@ spec:
     image: will-be-overwritten
 ```
 
+#### Using Volcano as Customized Scheduler for Spark on Kubernetes
+
+##### Prerequisites
+* Volcano supports Spark on Kubernetes since v1.5. Mini version: v1.5.1+. See also [Volcano installation](https://volcano.sh/en/docs/installation).
+
+##### Usage
+Spark on Kubernetes allows using Volcano as a customized scheduler. Users can use Volcano to
+support more advanced resource scheduling: queue scheduling, resource reservation, priority scheduling, for example:
+
+```
+# Specify volcano scheduler
+--conf spark.kubernetes.scheduler.name=volcano
+# Specify driver/executor VolcanoFeatureStep
+--conf spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
+--conf spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
+# Specify PodGroup template
+--conf spark.kubernetes.scheduler.volcano.podGroupTemplateFile=/path/to/podgroup-template.yaml
+```
+
+##### Volcano Feature Step
+Volcano feature steps help users to create Volcano PodGroup and set driver/executor pod annotation to link this PodGroup.
+
+Note that, currently only supported driver/job level PodGroup in Volcano Feature Step, executor separate PodGroup is not supported yet.
+
+##### Volcano PodGroup Template
+Volcano defines PodGroup spec using [CRD yaml](https://volcano.sh/en/docs/podgroup/#example)
+
+Similar to [Pod template](#pod-template), Spark users can similarly use Volcano PodGroup Template to define the PodGroup spec configurations.
+
+To do so, specify the spark properties `spark.kubernetes.scheduler.volcano.podGroupTemplateFile` to point to files accessible to the `spark-submit` process.
+
+Below is an example of PodGroup template, see also [PodGroup Introduction](https://volcano.sh/en/docs/podgroup/#introduction):
+
+```
+apiVersion: scheduling.volcano.sh/v1beta1
+kind: PodGroup
+spec:
+  # Specify minMember to 1 to make driver
+  minMember: 1
+  # Specify minResources to support resource reservation
+  minResources:
+    cpu: "2"
+    memory: "3Gi"
+  # Specify the priority
+  priorityClassName: high-priority
+  queue: default
+```
+
+##### Features
+<table class="table">
+<tr><th>Scheduling</th><th>Description</th><th>Configuration</th></tr>
+<tr>
+  <td>Queue scheduling</td>
+  <td>
+    Queue indicates the resource queue, which adopts FIFO. is also used as the basis for resource division.
+    help users specify which queue the job to submit.

Review comment:
       ```suggestion
       Helps the user specify the queue to which the job should be submitted.
   ```

##########
File path: docs/running-on-kubernetes.md
##########
@@ -1722,6 +1722,83 @@ spec:
     image: will-be-overwritten
 ```
 
+#### Using Volcano as Customized Scheduler for Spark on Kubernetes
+
+##### Prerequisites
+* Volcano supports Spark on Kubernetes since v1.5. Mini version: v1.5.1+. See also [Volcano installation](https://volcano.sh/en/docs/installation).
+
+##### Usage
+Spark on Kubernetes allows using Volcano as a customized scheduler. Users can use Volcano to
+support more advanced resource scheduling: queue scheduling, resource reservation, priority scheduling, for example:
+

Review comment:
       + To use Volcano as a scheduler, the user needs to specify the following configuration options:
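
       As a rough illustration of what that sentence would introduce, here is a minimal Python sketch that assembles the `--conf` flags quoted in the diff. The helper name `volcano_conf_args` is hypothetical and not part of Spark, and treating the PodGroup template file as optional is an assumption of this sketch rather than something stated in the PR.

```python
# Hypothetical helper (not part of Spark): builds the spark-submit --conf
# arguments shown in the diff for selecting Volcano as the scheduler.
VOLCANO_STEP = "org.apache.spark.deploy.k8s.features.VolcanoFeatureStep"


def volcano_conf_args(pod_group_template=None):
    """Return a flat list of --conf arguments for spark-submit."""
    conf = {
        # Scheduler name and driver/executor feature steps, as quoted in the diff.
        "spark.kubernetes.scheduler.name": "volcano",
        "spark.kubernetes.driver.pod.featureSteps": VOLCANO_STEP,
        "spark.kubernetes.executor.pod.featureSteps": VOLCANO_STEP,
    }
    # Assumption: the PodGroup template is optional, so it is added only when given.
    if pod_group_template is not None:
        conf["spark.kubernetes.scheduler.volcano.podGroupTemplateFile"] = pod_group_template

    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the flags on one line, ready to paste after `spark-submit`.
    print(" ".join(volcano_conf_args("/path/to/podgroup-template.yaml")))
```

       The resulting list could be appended to an existing `spark-submit` invocation; the template path is the placeholder used in the diff.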

##########
File path: docs/running-on-kubernetes.md
##########
@@ -1722,6 +1722,83 @@ spec:
     image: will-be-overwritten
 ```
 
+#### Using Volcano as Customized Scheduler for Spark on Kubernetes
+
+##### Prerequisites
+* Volcano supports Spark on Kubernetes since v1.5. Mini version: v1.5.1+. See also [Volcano installation](https://volcano.sh/en/docs/installation).
+
+##### Usage
+Spark on Kubernetes allows using Volcano as a customized scheduler. Users can use Volcano to
+support more advanced resource scheduling: queue scheduling, resource reservation, priority scheduling, for example:
+
+```
+# Specify volcano scheduler
+--conf spark.kubernetes.scheduler.name=volcano
+# Specify driver/executor VolcanoFeatureStep
+--conf spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
+--conf spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
+# Specify PodGroup template
+--conf spark.kubernetes.scheduler.volcano.podGroupTemplateFile=/path/to/podgroup-template.yaml
+```
+
+##### Volcano Feature Step
+Volcano feature steps help users to create Volcano PodGroup and set driver/executor pod annotation to link this PodGroup.
+
+Note that, currently only supported driver/job level PodGroup in Volcano Feature Step, executor separate PodGroup is not supported yet.
+
+##### Volcano PodGroup Template
+Volcano defines PodGroup spec using [CRD yaml](https://volcano.sh/en/docs/podgroup/#example)
+
+Similar to [Pod template](#pod-template), Spark users can similarly use Volcano PodGroup Template to define the PodGroup spec configurations.
+
+To do so, specify the spark properties `spark.kubernetes.scheduler.volcano.podGroupTemplateFile` to point to files accessible to the `spark-submit` process.
+
+Below is an example of PodGroup template, see also [PodGroup Introduction](https://volcano.sh/en/docs/podgroup/#introduction):
+
+```
+apiVersion: scheduling.volcano.sh/v1beta1
+kind: PodGroup
+spec:
+  # Specify minMember to 1 to make driver
+  minMember: 1
+  # Specify minResources to support resource reservation
+  minResources:
+    cpu: "2"
+    memory: "3Gi"
+  # Specify the priority
+  priorityClassName: high-priority
+  queue: default
+```
+
+##### Features
+<table class="table">
+<tr><th>Scheduling</th><th>Description</th><th>Configuration</th></tr>
+<tr>
+  <td>Queue scheduling</td>
+  <td>
+    Queue indicates the resource queue, which adopts FIFO. is also used as the basis for resource division.

Review comment:
       ```suggestion
       Queue indicates the resource queue, which adopts FIFO. It is also used as the basis for resource division.
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


