Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/08/25 06:02:24 UTC

[GitHub] [spark] yangwwei opened a new pull request, #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

yangwwei opened a new pull request, #37622:
URL: https://github.com/apache/spark/pull/37622

   ### What changes were proposed in this pull request?
   Add a section under [customized-kubernetes-schedulers-for-spark-on-kubernetes](https://spark.apache.org/docs/latest/running-on-kubernetes.html#customized-kubernetes-schedulers-for-spark-on-kubernetes) to explain how to run Spark with Apache YuniKorn. This is based on the review comments from #35663.
   
   
   ### Why are the changes needed?
   Explain how to run Spark with Apache YuniKorn
   
   ### Does this PR introduce _any_ user-facing change?
   No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] yangwwei commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
yangwwei commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952954104


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).

Review Comment:
   Hi @dongjoon-hyun. This is how the doc site works: `next` is the current under-development version, so we shouldn't use it; that's a good point. But I think we can use the latest stable version, which lives at https://yunikorn.apache.org/docs/. Only past versions are accessible via https://yunikorn.apache.org/docs/{VERSION_NUM}, which is why you did not see 1.0.0 there: 1.0.0 is the current stable version.
   
   If we use a hard-coded version here, e.g. 1.0.0, we will need to come back and update the doc quite often, which I don't think is good. So my question is: is it better to use the latest stable version here, or a hard-coded version that will need updates over time? Please let me know, thanks!





[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952872525


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn
+```
+
+the above steps will install the latest version of YuniKorn on an existing Kubernetes cluster.
+
+##### Get started
+
+Submit Spark jobs with the following extra options:
+
+```bash
+--conf spark.kubernetes.scheduler.name=yunikorn
+--conf spark.kubernetes.driver.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+--conf spark.kubernetes.executor.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+```
+
+Note, `{{APP_ID}}` is the builtin variable that will be substituted with Spark job ID automatically.
+With the above configuration, the job will be scheduled by YuniKorn scheduler instead of the default Kubernetes scheduler.
+
+##### Work with YuniKorn queues
+
+Apache YuniKorn supports 2 types of resource queues:
+
+- Static
+- Dynamic
+
+The static queues are predefined in YuniKorn configmap, and the dynamic queues are automatically created by the scheduler
+based on [placement rules](https://yunikorn.apache.org/docs/next/user_guide/placement_rules). Spark supports to run with
+both queue setup. Refer to this [doc](https://yunikorn.apache.org/docs/next/user_guide/resource_quota_management) for more
+information about how to run Spark with different queue setup.
+

Review Comment:
   Lastly, are there any limitations of YuniKorn? Specifically, Apache Spark:
   - Supports the ARM64 architecture, e.g. Graviton2
   - Supports IPv6-only environments via https://kubernetes.io/docs/concepts/services-networking/dual-stack/
   
   I'm wondering if there is any documentation on the YuniKorn website about those environments.





[GitHub] [spark] yangwwei commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
yangwwei commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952962802


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn
+```
+
+the above steps will install the latest version of YuniKorn on an existing Kubernetes cluster.
+
+##### Get started
+
+Submit Spark jobs with the following extra options:
+
+```bash
+--conf spark.kubernetes.scheduler.name=yunikorn
+--conf spark.kubernetes.driver.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+--conf spark.kubernetes.executor.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+```
+
+Note, `{{APP_ID}}` is the builtin variable that will be substituted with Spark job ID automatically.
+With the above configuration, the job will be scheduled by YuniKorn scheduler instead of the default Kubernetes scheduler.
+
+##### Work with YuniKorn queues
+
+Apache YuniKorn supports 2 types of resource queues:
+
+- Static
+- Dynamic
+
+The static queues are predefined in YuniKorn configmap, and the dynamic queues are automatically created by the scheduler
+based on [placement rules](https://yunikorn.apache.org/docs/next/user_guide/placement_rules). Spark supports to run with
+both queue setup. Refer to this [doc](https://yunikorn.apache.org/docs/next/user_guide/resource_quota_management) for more
+information about how to run Spark with different queue setup.
+

Review Comment:
   Good point. As of today, YuniKorn doesn't publish Docker images for arm64, so users can only build from source. This is a limitation; let me add some documentation for it. There is no issue with supporting IPv6.
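
A minimal sketch of how a user might check whether this limitation affects a cluster. It assumes `kubectl` is configured for the target cluster; the helper names `list_node_arch` and `flag_arm64_nodes` are hypothetical and are not part of Spark or YuniKorn.

```bash
# List node names with their CPU architecture, one "name<TAB>arch" pair per line.
# Requires kubectl and access to the cluster.
list_node_arch() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.architecture}{"\n"}{end}'
}

# Read "name<TAB>arch" lines on stdin and report the arm64 nodes.
flag_arm64_nodes() {
  awk '$2 == "arm64" { print $1 ": arm64 node, YuniKorn image must be built from source" }'
}

# Usage (hypothetical): list_node_arch | flag_arm64_nodes
```

The filter is kept separate from the `kubectl` call so it can be tried on sample input without a cluster.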





[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add Apache YuniKorn scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952739068


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).

Review Comment:
   We cannot use the version alias `next` here because it will be fragile in the future.
   - We need to use a concrete version `v1.0.0` link instead.
   - However, Apache YuniKorn doesn't provide a `1.0.0` docs link yet.
     - https://yunikorn.apache.org/docs/1.0.0/ is broken.
     - The latest versioned docs seem to be https://yunikorn.apache.org/docs/0.12.2/ .
   
   We need the Apache YuniKorn community's help here, @yangwwei .
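
As a rough sketch, the versioned links discussed here could be built and checked from a shell. The helper names `yunikorn_docs_url` and `check_link` are hypothetical, and the page path is taken from the diff above. Whether a given URL resolves depends on the YuniKorn site at the time of checking.

```bash
# Build a YuniKorn docs URL. An empty version yields the unversioned docs path.
yunikorn_docs_url() {
  version="$1"   # e.g. "0.12.2", "next", or "" for the unversioned docs
  page="$2"      # e.g. "get_started/core_features"
  if [ -n "$version" ]; then
    printf 'https://yunikorn.apache.org/docs/%s/%s\n' "$version" "$page"
  else
    printf 'https://yunikorn.apache.org/docs/%s\n' "$page"
  fi
}

# Report whether a URL currently resolves (needs network access and curl).
check_link() {
  if curl -fsSIL -o /dev/null "$1"; then
    echo "OK      $1"
  else
    echo "BROKEN  $1"
  fi
}

# Usage (hypothetical):
#   check_link "$(yunikorn_docs_url 0.12.2 get_started/core_features)"
```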





[GitHub] [spark] dongjoon-hyun commented on pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37622:
URL: https://github.com/apache/spark/pull/37622#issuecomment-1234034066

   I created a test suite PR, @yangwwei .
   - https://github.com/apache/spark/pull/37753




[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add Apache YuniKorn scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952797580


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn
+```
+
+the above steps will install the latest version of YuniKorn on an existing Kubernetes cluster.

Review Comment:
   `the above` -> `The above`





[GitHub] [spark] yangwwei commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
yangwwei commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952960794


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn
+```
+
+the above steps will install the latest version of YuniKorn on an existing Kubernetes cluster.
+
+##### Get started
+
+Submit Spark jobs with the following extra options:
+
+```bash
+--conf spark.kubernetes.scheduler.name=yunikorn
+--conf spark.kubernetes.driver.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+--conf spark.kubernetes.executor.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+```
+
+Note, `{{APP_ID}}` is the builtin variable that will be substituted with Spark job ID automatically.
+With the above configuration, the job will be scheduled by YuniKorn scheduler instead of the default Kubernetes scheduler.
+
+##### Work with YuniKorn queues
+
+Apache YuniKorn supports 2 types of resource queues:
+
+- Static
+- Dynamic
+
+The static queues are predefined in YuniKorn configmap, and the dynamic queues are automatically created by the scheduler
+based on [placement rules](https://yunikorn.apache.org/docs/next/user_guide/placement_rules). Spark supports to run with

Review Comment:
   Fixed to the current version





[GitHub] [spark] yangwwei commented on pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
yangwwei commented on PR #37622:
URL: https://github.com/apache/spark/pull/37622#issuecomment-1232299398

   @dongjoon-hyun, circling back on this: does the latest version look good to you?
   Is there anything else you want me to address in this doc? Please let me know, thanks!




[GitHub] [spark] yangwwei commented on pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
yangwwei commented on PR #37622:
URL: https://github.com/apache/spark/pull/37622#issuecomment-1227634279

   > According to [apache/yunikorn-site#180 (comment)](https://github.com/apache/yunikorn-site/pull/180#issuecomment-1226820265) , I understand the context (including rollback) and let's hold on this until next week (YuniKorn v1.1).
   
   Please take a look at the updated version. I do not think we depend on the 1.0.0 doc link anymore. We just need to make sure we explicitly set the version in the installation example. For general docs, we can point to https://yunikorn.apache.org.
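
A sketch of what pinning the version in the installation example might look like. It uses Helm's `--version` flag and assumes the chart version matches the release number; `1.0.0` is the stable release named in this thread, not a value confirmed against the chart repository. The function name `yunikorn_install_cmds` is hypothetical.

```bash
# Chart version to pin; override by exporting YUNIKORN_VERSION before running.
YUNIKORN_VERSION="${YUNIKORN_VERSION:-1.0.0}"

# Print the installation commands with the chart version pinned.
yunikorn_install_cmds() {
  cat <<EOF
helm repo add yunikorn https://apache.github.io/yunikorn-release
helm repo update
kubectl create namespace yunikorn
helm install yunikorn yunikorn/yunikorn --namespace yunikorn --version ${YUNIKORN_VERSION}
EOF
}

# Print the commands for review; pipe the output to `sh` to run them.
yunikorn_install_cmds
```

Pinning this way means the documented steps keep installing the same release even after newer charts are published.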




[GitHub] [spark] yangwwei commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
yangwwei commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952959912


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn

Review Comment:
   Good catch, let me fix that.





[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r953077805


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn
+```
+
+the above steps will install the latest version of YuniKorn on an existing Kubernetes cluster.
+
+##### Get started
+
+Submit Spark jobs with the following extra options:
+
+```bash
+--conf spark.kubernetes.scheduler.name=yunikorn
+--conf spark.kubernetes.driver.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+--conf spark.kubernetes.executor.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+```
+
+Note, `{{APP_ID}}` is the builtin variable that will be substituted with Spark job ID automatically.
+With the above configuration, the job will be scheduled by YuniKorn scheduler instead of the default Kubernetes scheduler.
+
+##### Work with YuniKorn queues

Review Comment:
   If this is all you can add, we had better remove the `Work with YuniKorn queues` section, because there is nothing in it for the user to do.





[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952877147


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn

Review Comment:
   BTW, IIRC, Apache YuniKorn is now a top-level project (TLP), not an incubating project.
   - https://apache.github.io/yunikorn-release
   
   Could you please fix the following content on the Apache YuniKorn page linked above?
   
   <img width="844" alt="Screen Shot 2022-08-23 at 9 45 07 AM" src="https://user-images.githubusercontent.com/9700541/186215061-58bbefbd-90ae-4bad-897e-bb51581246c6.png">





[GitHub] [spark] dongjoon-hyun closed pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun closed pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs
URL: https://github.com/apache/spark/pull/37622




[GitHub] [spark] yangwwei commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
yangwwei commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952964187


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn
+```
+
+the above steps will install the latest version of YuniKorn on an existing Kubernetes cluster.
+
+##### Get started
+
+Submit Spark jobs with the following extra options:
+
+```bash
+--conf spark.kubernetes.scheduler.name=yunikorn
+--conf spark.kubernetes.driver.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+--conf spark.kubernetes.executor.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+```
+
+Note, `{{APP_ID}}` is the builtin variable that will be substituted with Spark job ID automatically.
+With the above configuration, the job will be scheduled by YuniKorn scheduler instead of the default Kubernetes scheduler.
+
+##### Work with YuniKorn queues

Review Comment:
   Could you please share what kind of information I could add to make this section sufficient?
   I am trying to find a balance here: I do not want to make the doc tedious and hard to read, but I still want to make sure enough information is provided.





[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952956382


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).

Review Comment:
   That's what I asked for here. The Apache YuniKorn community should provide a 1.0.0 docs link, like Apache Spark did.
   - https://spark.apache.org/docs/3.3.0/
   
   That is mandatory in order to guarantee when we support something. Please see the Volcano example.
   > we will need to come back to update the doc quite often,





[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952850544


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn
+```
+
+the above steps will install the latest version of YuniKorn on an existing Kubernetes cluster.
+
+##### Get started
+
+Submit Spark jobs with the following extra options:
+
+```bash
+--conf spark.kubernetes.scheduler.name=yunikorn
+--conf spark.kubernetes.driver.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+--conf spark.kubernetes.executor.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+```
+
+Note, `{{APP_ID}}` is the builtin variable that will be substituted with Spark job ID automatically.

Review Comment:
   nit.
   - `Note,` -> `Note that`
   - `builtin` -> `built-in`
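
For context, a sketch of how the three options from the diff might fit into a complete `spark-submit` invocation. The function name `yunikorn_spark_submit_cmd` is hypothetical, and the master URL, image and application jar are placeholders to replace. `{{APP_ID}}` is left literal, since the diff says Spark substitutes it automatically.

```bash
# Print a spark-submit command line that includes the YuniKorn options from the diff.
# Arguments: Kubernetes master URL, container image, application jar (all placeholders).
yunikorn_spark_submit_cmd() {
  master="$1"; image="$2"; app_jar="$3"
  cat <<EOF
spark-submit \\
  --master ${master} \\
  --deploy-mode cluster \\
  --conf spark.kubernetes.container.image=${image} \\
  --conf spark.kubernetes.scheduler.name=yunikorn \\
  --conf spark.kubernetes.driver.annotation.yunikorn.apache.org/app-id={{APP_ID}} \\
  --conf spark.kubernetes.executor.annotation.yunikorn.apache.org/app-id={{APP_ID}} \\
  ${app_jar}
EOF
}

# Example with placeholder values; the output is printed, not executed.
yunikorn_spark_submit_cmd "k8s://https://<api-server>:6443" "<spark-image>" "local:///path/to/app.jar"
```

A Java or Scala application would also need `--class` with its main class, which is omitted here.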





[GitHub] [spark] yangwwei commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
yangwwei commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r953051876


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn

Review Comment:
   This page https://apache.github.io/yunikorn-release has been fixed.
   
   > It seems that YuniKorn helm chart doesn't have TEST SUITE.
   
   Yes, there is no test suite today in the helm charts.
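
Because the chart ships no Helm test suite, a manual check is one way to confirm the install. The sketch below is an illustration and not part of the PR: the `run` and `verify_yunikorn` helpers and the `DRY_RUN` switch are hypothetical names, and the release and namespace values are taken from the install commands quoted above. By default it only prints the commands instead of running them.

```bash
# Sketch only: with DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
NAMESPACE="yunikorn"   # namespace created by the install commands quoted in this thread
RELEASE="yunikorn"     # Helm release name used in the same install commands

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: manual stand-in for the missing chart test suite.
verify_yunikorn() {
  # Show the release status as recorded by Helm.
  run helm status "$RELEASE" --namespace "$NAMESPACE"
  # Block until every pod in the namespace reports Ready, or time out.
  run kubectl wait --for=condition=Ready pods --all --namespace "$NAMESPACE" --timeout=120s
  # List the pods for a final visual check.
  run kubectl get pods --namespace "$NAMESPACE"
}

verify_yunikorn
```

Setting `DRY_RUN=0` before calling `verify_yunikorn` would run the commands against the current kubectl context.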
   
   





[GitHub] [spark] AmplabJenkins commented on pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on PR #37622:
URL: https://github.com/apache/spark/pull/37622#issuecomment-1224372281

   Can one of the admins verify this patch?




[GitHub] [spark] yangwwei closed pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
yangwwei closed pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs
URL: https://github.com/apache/spark/pull/37622




[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952877147


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn

Review Comment:
   BTW, IIRC, Apache YuniKorn is now a TLP, not an incubating project.
   - https://apache.github.io/yunikorn-release
   
   Could you update the Apache YuniKorn link, please?
   
   [Screenshot: https://user-images.githubusercontent.com/9700541/186215061-58bbefbd-90ae-4bad-897e-bb51581246c6.png]





[GitHub] [spark] dongjoon-hyun commented on pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37622:
URL: https://github.com/apache/spark/pull/37622#issuecomment-1227739668

   Got it. Let me think about that again from an Apache Spark user's perspective. I believe we can find a sweet spot that satisfies both communities, @yangwwei. Thank you again for all your contributions and collaboration.




[GitHub] [spark] dongjoon-hyun commented on pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37622:
URL: https://github.com/apache/spark/pull/37622#issuecomment-1226843038

   According to https://github.com/apache/yunikorn-site/pull/180#issuecomment-1226820265, I understand the context (including the rollback). Let's hold off on this until next week (YuniKorn v1.1).




[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add Apache YuniKorn scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952744027


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn
+```
+
+the above steps will install the latest version of YuniKorn on an existing Kubernetes cluster.

Review Comment:
   Please use a specific version instead of recommending `the latest version` in the doc.





[GitHub] [spark] yangwwei commented on pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
yangwwei commented on PR #37622:
URL: https://github.com/apache/spark/pull/37622#issuecomment-1234636614

   Hi @dongjoon-hyun, thanks a lot for your help with this.
   This is a great community collaboration between YuniKorn and Spark. Thank you so much!




[GitHub] [spark] yangwwei commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
yangwwei commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952954477


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites

Review Comment:
   ACK





[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952872525


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn
+```
+
+the above steps will install the latest version of YuniKorn on an existing Kubernetes cluster.
+
+##### Get started
+
+Submit Spark jobs with the following extra options:
+
+```bash
+--conf spark.kubernetes.scheduler.name=yunikorn
+--conf spark.kubernetes.driver.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+--conf spark.kubernetes.executor.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+```
+
+Note, `{{APP_ID}}` is the builtin variable that will be substituted with Spark job ID automatically.
+With the above configuration, the job will be scheduled by YuniKorn scheduler instead of the default Kubernetes scheduler.
+
+##### Work with YuniKorn queues
+
+Apache YuniKorn supports 2 types of resource queues:
+
+- Static
+- Dynamic
+
+The static queues are predefined in YuniKorn configmap, and the dynamic queues are automatically created by the scheduler
+based on [placement rules](https://yunikorn.apache.org/docs/next/user_guide/placement_rules). Spark supports to run with
+both queue setup. Refer to this [doc](https://yunikorn.apache.org/docs/next/user_guide/resource_quota_management) for more
+information about how to run Spark with different queue setup.
+

Review Comment:
   Lastly, are there any limitations in YuniKorn? Can we assume that YuniKorn will work in all Spark-capable environments? Specifically, Apache Spark:
   - Supports the ARM64 architecture, such as Graviton2
   - Supports IPv6-only environments via https://kubernetes.io/docs/concepts/services-networking/dual-stack/
   
   I'm wondering if there is any documentation on the YuniKorn website about those environments.





[GitHub] [spark] yangwwei commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
yangwwei commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r954534805


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites

Review Comment:
   Explicitly added version 1.0.0 to the installation example.





[GitHub] [spark] dongjoon-hyun commented on pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37622:
URL: https://github.com/apache/spark/pull/37622#issuecomment-1234811066

   Thank YOU, @yangwwei .




[GitHub] [spark] yangwwei commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
yangwwei commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952960313


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn
+```
+
+the above steps will install the latest version of YuniKorn on an existing Kubernetes cluster.
+
+##### Get started
+
+Submit Spark jobs with the following extra options:
+
+```bash
+--conf spark.kubernetes.scheduler.name=yunikorn
+--conf spark.kubernetes.driver.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+--conf spark.kubernetes.executor.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+```
+
+Note, `{{APP_ID}}` is the builtin variable that will be substituted with Spark job ID automatically.

Review Comment:
   Done





[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r954549867


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,42 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [core features](https://yunikorn.apache.org/docs/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn --version 1.0.0

Review Comment:
   Thank you for adding this. This is much better.





[GitHub] [spark] yangwwei commented on pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
yangwwei commented on PR #37622:
URL: https://github.com/apache/spark/pull/37622#issuecomment-1227632338

   > In that case, why don't you add Apache Spark Support Matrix to Apache YuniKorn page?
   
   I don't think that will be necessary. Apache YuniKorn is a scheduler, a replacement for the default scheduler, and it isn't very sensitive to Spark versions. Any Spark version can run on YuniKorn with the necessary configs. In recent versions, with the support added by https://issues.apache.org/jira/browse/SPARK-38383, submitting jobs to YuniKorn is even easier (this is what the doc in this PR introduces). Likewise, we do not need a Spark support matrix for YARN or Kubernetes.
   
   Adding such a matrix on the YuniKorn side would be cumbersome: we would probably need to list all Spark versions, which isn't useful for end users.
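
For illustration, a complete submission using the options from this PR might look like the sketch below. It assumes a Spark version that includes SPARK-38383 and supports the three `--conf` options quoted in the diff. The master URL, image name, and jar path are hypothetical placeholders, `yunikorn_submit_cmd` is a made-up helper name, and the function only prints the command rather than running it.

```bash
# Hypothetical placeholders; replace with values for your own cluster.
K8S_MASTER="k8s://https://example.invalid:6443"
SPARK_IMAGE="my-registry.example/spark:latest"
APP_JAR="local:///path/to/examples.jar"

# Print (do not run) a spark-submit command line that hands scheduling to YuniKorn.
yunikorn_submit_cmd() {
  echo spark-submit \
    --master "$K8S_MASTER" \
    --deploy-mode cluster \
    --class org.apache.spark.examples.SparkPi \
    --conf "spark.kubernetes.container.image=$SPARK_IMAGE" \
    --conf spark.kubernetes.scheduler.name=yunikorn \
    --conf "spark.kubernetes.driver.annotation.yunikorn.apache.org/app-id={{APP_ID}}" \
    --conf "spark.kubernetes.executor.annotation.yunikorn.apache.org/app-id={{APP_ID}}" \
    "$APP_JAR"
}

yunikorn_submit_cmd
```

The `{{APP_ID}}` text is passed through literally; per the doc under review, Spark substitutes it with the job ID at submission time.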




[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add Apache YuniKorn scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952747634


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn

Review Comment:
   It seems that `YuniKorn` helm chart doesn't have `TEST SUITE`. Did I understand correctly?
   ```
   $ helm install yunikorn yunikorn/yunikorn --namespace yunikorn
   NAME: yunikorn
   LAST DEPLOYED: Tue Aug 23 08:03:37 2022
   NAMESPACE: yunikorn
   STATUS: deployed
   REVISION: 1
   TEST SUITE: None
   ```





[GitHub] [spark] yangwwei commented on pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
yangwwei commented on PR #37622:
URL: https://github.com/apache/spark/pull/37622#issuecomment-1226822958

   @dongjoon-hyun could you please take a look at the updated version? Here are the changes:
   
   1. Used 1.0.0 in the installation example to comply with Spark community requirements
   2. Removed the "Work with YuniKorn queues" section, as it adds complexity to the doc and is not something Spark users have to do
   3. Addressed other review comments
   
   There is one more thing, about our versioned docs. I implemented a temporary workaround in https://issues.apache.org/jira/browse/YUNIKORN-1293, but YuniKorn community folks made a very good point that the workaround isn't sustainable. Since the doc now contains only the following links:
   - https://yunikorn.apache.org/
   - https://yunikorn.apache.org/docs/get_started/core_features
   
   I think it should be fine not to bind them to a specific version. What do you think? YuniKorn folks also pointed out that similar docs already exist, such as the following:
   
   ```
   Volcano feature steps help users to create a Volcano PodGroup and set driver/executor pod annotation to link with this [PodGroup](https://volcano.sh/en/docs/podgroup/).
   ```
   
   and
   
   ```
   Volcano defines PodGroup spec using [CRD yaml](https://volcano.sh/en/docs/podgroup/#example). 
   ```
   
   Please let me know your thoughts on this, thanks!




[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add Apache YuniKorn scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952742231


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites

Review Comment:
   Ditto. Please use a specific YuniKorn version in the installation example.





[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952851284


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn
+```
+
+the above steps will install the latest version of YuniKorn on an existing Kubernetes cluster.
+
+##### Get started
+
+Submit Spark jobs with the following extra options:
+
+```bash
+--conf spark.kubernetes.scheduler.name=yunikorn
+--conf spark.kubernetes.driver.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+--conf spark.kubernetes.executor.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+```
+
+Note, `{{APP_ID}}` is the builtin variable that will be substituted with Spark job ID automatically.
+With the above configuration, the job will be scheduled by YuniKorn scheduler instead of the default Kubernetes scheduler.
+
+##### Work with YuniKorn queues
+
+Apache YuniKorn supports 2 types of resource queues:
+
+- Static
+- Dynamic
+
+The static queues are predefined in YuniKorn configmap, and the dynamic queues are automatically created by the scheduler
+based on [placement rules](https://yunikorn.apache.org/docs/next/user_guide/placement_rules). Spark supports to run with

Review Comment:
   Please use specific version instead of `next`.





[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r953076970


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn

Review Comment:
   Thanks!





[GitHub] [spark] yangwwei commented on pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
yangwwei commented on PR #37622:
URL: https://github.com/apache/spark/pull/37622#issuecomment-1224867456

   > It seems that I wasn't clear enough to you. We need a specific version number, @yangwwei . -1 for adding a doc without a specific version.
   > 
   > > https://yunikorn.apache.org/docs/get_started/core_features.
   
   You were absolutely clear. Sure, let me find out how to get this supported by our documentation framework. Thanks.




[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #37622:
URL: https://github.com/apache/spark/pull/37622#discussion_r952861619


##########
docs/running-on-kubernetes.md:
##########
@@ -1811,6 +1811,50 @@ spec:
   queue: default
 ```
 
+#### Using Apache YuniKorn as Customized Scheduler for Spark on Kubernetes
+
+[Apache YuniKorn](https://yunikorn.apache.org/) is a resource scheduler for Kubernetes that provides advanced batch scheduling
+capabilities, such as job queuing, resource fairness, min/max queue capacity and flexible job ordering policies.
+For available Apache YuniKorn features, please refer to [this doc](https://yunikorn.apache.org/docs/next/get_started/core_features).
+
+##### Prerequisites
+
+Install Apache YuniKorn:
+
+```bash
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+kubectl create namespace yunikorn
+helm install yunikorn yunikorn/yunikorn --namespace yunikorn
+```
+
+the above steps will install the latest version of YuniKorn on an existing Kubernetes cluster.
+
+##### Get started
+
+Submit Spark jobs with the following extra options:
+
+```bash
+--conf spark.kubernetes.scheduler.name=yunikorn
+--conf spark.kubernetes.driver.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+--conf spark.kubernetes.executor.annotation.yunikorn.apache.org/app-id={{APP_ID}}
+```
+
+Note, `{{APP_ID}}` is the builtin variable that will be substituted with Spark job ID automatically.
+With the above configuration, the job will be scheduled by YuniKorn scheduler instead of the default Kubernetes scheduler.
+
+##### Work with YuniKorn queues

Review Comment:
   This document is insufficient as a Spark document.
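
For context, the three options in the quoted "Get started" block only make sense as part of a complete submission. A minimal dry-run sketch of how they might be combined is below. The master URL, container image, and jar path are hypothetical placeholders that do not come from the PR, and the script only prints the arguments rather than calling `spark-submit`.

```bash
#!/usr/bin/env bash
# Sketch only: assembles spark-submit arguments that include the YuniKorn
# options quoted in the diff, and prints them instead of running anything.
set -euo pipefail

# Hypothetical placeholders, not taken from the PR under review.
K8S_MASTER="k8s://https://kubernetes.example.com:6443"
SPARK_IMAGE="example.com/spark:latest"
APP_JAR="local:///opt/spark/examples/jars/spark-examples.jar"

# Prints one spark-submit argument per line.
# The last three --conf entries are the ones from the quoted diff;
# {{APP_ID}} is left literal so that Spark can substitute it.
yunikorn_submit_args() {
  local args=(
    --master "$K8S_MASTER"
    --deploy-mode cluster
    --class org.apache.spark.examples.SparkPi
    --conf "spark.kubernetes.container.image=$SPARK_IMAGE"
    --conf "spark.kubernetes.scheduler.name=yunikorn"
    --conf "spark.kubernetes.driver.annotation.yunikorn.apache.org/app-id={{APP_ID}}"
    --conf "spark.kubernetes.executor.annotation.yunikorn.apache.org/app-id={{APP_ID}}"
    "$APP_JAR"
  )
  printf '%s\n' "${args[@]}"
}

# Dry run: show what would be passed to spark-submit.
yunikorn_submit_args
```

To submit for real, the printed arguments would be passed to `spark-submit` on a host that can reach the cluster, with the placeholders replaced by actual values.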





[GitHub] [spark] dongjoon-hyun commented on pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37622:
URL: https://github.com/apache/spark/pull/37622#issuecomment-1233757469

   Sorry for being late, @yangwwei . We can move forward based on this.




[GitHub] [spark] dongjoon-hyun commented on pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37622:
URL: https://github.com/apache/spark/pull/37622#issuecomment-1226831131

   In that case, why don't you add an `Apache Spark Support Matrix` to the `Apache YuniKorn` page?
   > I think it should be fine not to bind to a specific version, what do you think?
   
   Specifically, I'm suggesting the following.
   - Apache Spark simply adds a forward link to the `Apache YuniKorn Support Matrix` page.
   - The Apache YuniKorn page (v1.0.0 or a future release) will be tested and maintained on the YuniKorn community side.




[GitHub] [spark] dongjoon-hyun commented on pull request #37622: [SPARK-40187][DOCS] Add `Apache YuniKorn` scheduler docs

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37622:
URL: https://github.com/apache/spark/pull/37622#issuecomment-1226832661

   BTW, let me test YuniKorn further against this doc tomorrow, @yangwwei .

