Posted to user@spark.apache.org by Gnana Kumar <gn...@gmail.com> on 2022/11/16 15:20:02 UTC

VolcanoFeatureStep( Custom Scheduler ) not found in Spark 3.3.1 archive

Hi There,

I have installed Spark 3.3.1 and tried to use the following configuration
in spark-submit for a Spark job running in a Kubernetes cluster, and I got a
ClassNotFoundException for the class "VolcanoFeatureStep":

--conf
spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
--conf
spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep

When I unzipped the Spark 3.3.1 archive, I could not find the
VolcanoFeatureStep.class file.

May I know if the Volcano feature has been released in v3.3.1? How can I
resolve this ClassNotFoundException?
Kindly help in resolving this issue.

Thanks
Gnana

Re: VolcanoFeatureStep( Custom Scheduler ) not found in Spark 3.3.1 archive

Posted by Gnana Kumar <gn...@gmail.com>.
Hi Chris,

Any ideas on resolving this issue? Please let me know.

Thanks
Gnana

On Sun, Nov 20, 2022 at 6:32 PM Gnana Kumar <gn...@gmail.com>
wrote:

> Will there be any issue since all the Spark jar names contain -SNAPSHOT?
> And I am not sure how they are packed within the Docker container.
>
> [image: image.png]
>
> On Sun, Nov 20, 2022 at 6:19 PM Gnana Kumar <gn...@gmail.com>
> wrote:
>
>> Thanks, Chris, for your guidance. Your Maven commands worked.
>>
>> I have been able to build the source and generate the binary distribution
>> as well using below steps.
>>
>> >mvn -Denforcer.skip=true -DrecompileMode=all -Pkubernetes -Pvolcano
>> -Pscala-2.12 -DskipTests clean package
>>
>> >dev/make-distribution.sh -Pkubernetes -Denforcer.skip=true
>> -DrecompileMode=all -Pvolcano -Pscala-2.12 -DskipTests
>>
>> >docker build -t spark3.3.2_gnana_volcano_scheduler_snapshot
>> -f kubernetes/dockerfiles/spark/Dockerfile .
>> >docker push spark3.3.2_gnana_volcano_scheduler_snapshot:latest
>>
>> spark_3.3.2/bin/spark-submit  \
>>            --verbose \
>>            --class com.demo.spark.SpringBootStarter  \
>>            --master k8s://https://$KUBERNETES_MASTER_IP:443 \
>>            --deploy-mode cluster \
>>            --name sparkSampleApp2 \
>>            --conf spark.kubernetes.namespace=default \
>>            --conf spark.network.timeout=300 \
>>            --conf spark.executor.instances=1 \
>>            --conf spark.kubernetes.scheduler.name=volcano \
>>            --conf
>> spark.kubernetes.scheduler.volcano.podGroupTemplateFile=/home/gnana_kumar123/spark/volcano_spark_podgroup_template_low_priority.yaml
>> \
>>            --conf
>> spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
>> \
>>            --conf
>> spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
>> \
>>
>> But unfortunately, when I run spark-submit to launch the Spark job in the
>> Kubernetes cluster, it fails with a ClassNotFoundException.
>>
>> Please find the exception from Spark-Submit.
>>
>> Using Spark's default log4j profile:
>> org/apache/spark/log4j-defaults.properties
>> 22/11/20 12:40:52 INFO SparkKubernetesClientFactory: Auto-configuring K8S
>> client using current context from users K8S config file
>> Exception in thread "main" java.lang.ClassNotFoundException:
>> org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
>>
>> So I extracted the jar from dist/jars and verified that the
>> spark-kubernetes jar contains VolcanoFeatureStep.class, but I am not sure
>> whether it is really packed within the Docker image.
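One way to confirm whether the class really made it into the image is to list the jars inside a throwaway container. This is a sketch under assumptions: the image tag is the one used in the docker build step above, /opt/spark/jars is the standard location in the stock Spark Dockerfile, and `unzip` may or may not be present in the image.

```shell
# Look for the kubernetes and volcano jars baked into the image.
docker run --rm spark3.3.2_gnana_volcano_scheduler_snapshot:latest \
  sh -c 'ls /opt/spark/jars | grep -i -e kubernetes -e volcano'

# If unzip is available in the image, search the spark-kubernetes jar
# inside the container for the class entry itself.
docker run --rm spark3.3.2_gnana_volcano_scheduler_snapshot:latest \
  sh -c 'unzip -l /opt/spark/jars/spark-kubernetes_*.jar | grep VolcanoFeatureStep || echo "class not packed"'
```

If the class is missing here while present in dist/jars, the Dockerfile was likely built against a different distribution directory than the one just produced.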
>>
>> Thanks
>> Gnana
>>
>>
>>
>> On Sat, Nov 19, 2022 at 12:27 AM Chris Nauroth <cn...@apache.org>
>> wrote:
>>
>>> Hello Gnana,
>>>
>>> I'm bringing this thread back to the user@ list for the benefit of
>>> anyone else who might want to try this feature.
>>>
>>> Running this from the root of the source tree should give you a working
>>> full build with Kubernetes and the experimental Volcano feature, using
>>> Scala 2.12:
>>>
>>> build/mvn -Pkubernetes -Pvolcano -Pscala-2.12 -DskipTests clean package
>>>
>>> If you want to use Scala 2.13, it would be this:
>>>
>>> dev/change-scala-version.sh 2.13
>>> build/mvn -Pkubernetes -Pvolcano -Pscala-2.13 -DskipTests clean package
>>>
>>> I don't expect you'd need to replace all jars in your deployment.
>>> However, in addition to spark-kubernetes.jar, I expect you'll need to get
>>> the Volcano client classes onto the classpath. Those are in
>>> volcano-client-5.12.2.jar and volcano-model-v1beta1-5.12.2.jar.
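If those client jars end up missing from the image, one untested workaround is to point spark-submit at them explicitly via `--jars`. The version numbers are the ones mentioned above, and the `local://` paths are an assumption about where the jars are copied inside the image; both may differ in your build.

```shell
spark-submit \
  --jars local:///opt/spark/jars/volcano-client-5.12.2.jar,local:///opt/spark/jars/volcano-model-v1beta1-5.12.2.jar \
  ...  # remaining options as in the full spark-submit command shown earlier
```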
>>>
>>> I haven't tested this new feature myself, so I don't know if there are
>>> other steps you'll hit after this. Speaking just in terms of what the build
>>> does though, this should be sufficient.
>>>
>>> I hope this helps.
>>>
>>> Chris Nauroth
>>>
>>>
>>> On Thu, Nov 17, 2022 at 11:32 PM Gnana Kumar <gn...@gmail.com>
>>> wrote:
>>>
>>>> I have built the spark-kubernetes jar with Maven
>>>> (spark-kubernetes_2.12-3.3.2-SNAPSHOT), but when I build the parent Spark
>>>> directory, the build fails:
>>>>
>>>> mvn clean install -Denforcer.skip=true -Pvolcano -DskipTests
>>>> -Dcheckstyle.skip
>>>>
>>>> On Fri, Nov 18, 2022 at 12:47 PM Gnana Kumar <gn...@gmail.com>
>>>> wrote:
>>>>
>>>>> Also, please confirm whether I have to use the SNAPSHOT version of all
>>>>> Spark jars for Volcano scheduling, or whether the Kubernetes jar
>>>>> (spark-kubernetes_2.12-3.3.2-SNAPSHOT.jar) alone is enough to perform
>>>>> scheduling.
>>>>>
>>>>> Thanks
>>>>> Gnana
>>>>>
>>>>> On Fri, Nov 18, 2022 at 10:27 AM Gnana Kumar <gn...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Chris,
>>>>>>
>>>>>> Thanks for the clarification.
>>>>>>
>>>>>> I have tried the steps below but am getting the following error. Please
>>>>>> help me resolve it; I need the Volcano feature available in my
>>>>>> spark-kubernetes jar.
>>>>>>
>>>>>> >git clone https://github.com/apache/spark.git -b branch-3.3
>>>>>> >cd spark
>>>>>> >mvn clean install -Denforcer.skip=true -Pvolcano -DskipTests
>>>>>>
>>>>>> [INFO]
>>>>>> ------------------------------------------------------------------------
>>>>>> [INFO] Reactor Summary for Spark Project Parent POM 3.3.2-SNAPSHOT:
>>>>>> [INFO]
>>>>>> [INFO] Spark Project Parent POM ........................... SUCCESS
>>>>>> [02:02 min]
>>>>>> [INFO] Spark Project Tags ................................. FAILURE [
>>>>>> 16.548 s]
>>>>>> [INFO] Spark Project Sketch ............................... SKIPPED
>>>>>> [INFO] Spark Project Local DB ............................. SKIPPED
>>>>>> [INFO] Spark Project Networking ........................... SKIPPED
>>>>>> [INFO] Spark Project Shuffle Streaming Service ............ SKIPPED
>>>>>> [INFO] Spark Project Unsafe ............................... SKIPPED
>>>>>> [INFO] Spark Project Launcher ............................. SKIPPED
>>>>>> [INFO] Spark Project Core ................................. SKIPPED
>>>>>> [INFO] Spark Project ML Local Library ..................... SKIPPED
>>>>>> [INFO] Spark Project GraphX ............................... SKIPPED
>>>>>> [INFO] Spark Project Streaming ............................ SKIPPED
>>>>>> [INFO] Spark Project Catalyst ............................. SKIPPED
>>>>>> [INFO] Spark Project SQL .................................. SKIPPED
>>>>>> [INFO] Spark Project ML Library ........................... SKIPPED
>>>>>> [INFO] Spark Project Tools ................................ SKIPPED
>>>>>> [INFO] Spark Project Hive ................................. SKIPPED
>>>>>> [INFO] Spark Project REPL ................................. SKIPPED
>>>>>> [INFO] Spark Project Assembly ............................. SKIPPED
>>>>>> [INFO] Kafka 0.10+ Token Provider for Streaming ........... SKIPPED
>>>>>> [INFO] Spark Integration for Kafka 0.10 ................... SKIPPED
>>>>>> [INFO] Kafka 0.10+ Source for Structured Streaming ........ SKIPPED
>>>>>> [INFO] Spark Project Examples ............................. SKIPPED
>>>>>> [INFO] Spark Integration for Kafka 0.10 Assembly .......... SKIPPED
>>>>>> [INFO] Spark Avro ......................................... SKIPPED
>>>>>> [INFO]
>>>>>> ------------------------------------------------------------------------
>>>>>> [INFO] BUILD FAILURE
>>>>>> [INFO]
>>>>>> ------------------------------------------------------------------------
>>>>>> [INFO] Total time:  02:21 min
>>>>>> [INFO] Finished at: 2022-11-18T10:23:08+05:30
>>>>>> [INFO]
>>>>>> ------------------------------------------------------------------------
>>>>>> [WARNING] The requested profile "volcano" could not be activated
>>>>>> because it does not exist.
>>>>>> [ERROR] Failed to execute goal
>>>>>> net.alchim31.maven:scala-maven-plugin:4.4.0:compile (scala-compile-first)
>>>>>> on project spark-tags_2.12: Execution scala-compile-first of goal
>>>>>> net.alchim31.maven:scala-maven-plugin:4.4.0:compile failed: An API
>>>>>> incompatibility was encountered while executing
>>>>>> net.alchim31.maven:scala-maven-plugin:4.4.0:compile:
>>>>>> java.lang.NoSuchMethodError:
>>>>>> org.fusesource.jansi.AnsiConsole.wrapOutputStream(Ljava/io/OutputStream;)Ljava/io/OutputStream;
>>>>>>
>>>>>> Thanks
>>>>>> Gnana
>>>>>>
>>>>>> On Thu, Nov 17, 2022 at 5:09 AM Chris Nauroth <cn...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> Hello Gnana,
>>>>>>>
>>>>>>> I think it's intentional that this is excluded from the binary
>>>>>>> release. By default, the build excludes this class [1]. It must be enabled
>>>>>>> in the build by activating a Maven profile [2]. The release script does not
>>>>>>> activate this profile [3].
>>>>>>>
>>>>>>> See the relevant pull requests ([4], [5]) for discussion of how this
>>>>>>> feature is considered experimental and therefore excluded by default from
>>>>>>> the previously GA'd 3.3 release line. If you want to use the feature, you
>>>>>>> still have the option of building from source with the -Pvolcano profile
>>>>>>> activated.
>>>>>>>
>>>>>>> [1]
>>>>>>> https://github.com/apache/spark/blob/branch-3.3/resource-managers/kubernetes/core/pom.xml#L135
>>>>>>> [2]
>>>>>>> https://github.com/apache/spark/blob/branch-3.3/resource-managers/kubernetes/core/pom.xml#L35-L54
>>>>>>> [3]
>>>>>>> https://github.com/apache/spark/blob/branch-3.3/dev/create-release/release-build.sh
>>>>>>> [4] https://github.com/apache/spark/pull/34456
>>>>>>> [5] https://github.com/apache/spark/pull/35422
>>>>>>>
>>>>>>> Chris Nauroth
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Nov 16, 2022 at 7:23 AM Gnana Kumar <
>>>>>>> gnana.kumar123@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi There,
>>>>>>>>
>>>>>>>> I have installed Spark 3.3.1 and tried to use the following
>>>>>>>> configuration in Spark Submit for a spark job to run in Kubernetes Cluster
>>>>>>>> and I have got class not found exception for the reference
>>>>>>>> "VolcanoFeatureStep"
>>>>>>>>
>>>>>>>> --conf
>>>>>>>> spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
>>>>>>>> --conf
>>>>>>>> spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
>>>>>>>>
>>>>>>>> When I unzipped the spark 3.3.1 archive, I could not see the
>>>>>>>> VolcanoFeatureStep.class file.
>>>>>>>>
>>>>>>>> May I know if Volcano feature has been released in v3.3.1 ? How to
>>>>>>>> resolve this Class not found exception ?
>>>>>>>> Kindly help resolving this issue.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Gnana
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Thanks
>>>>>> Gnana
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Thanks
>>>>> Gnana
>>>>>
>>>>
>>>>
>>>> --
>>>> Thanks
>>>> Gnana
>>>>
>>>
>>
>> --
>> Thanks
>> Gnana
>>
>
>
> --
> Thanks
> Gnana
>


-- 
Thanks
Gnana

Re: VolcanoFeatureStep( Custom Scheduler ) not found in Spark 3.3.1 archive

Posted by Gnana Kumar <gn...@gmail.com>.
Will there be any issue since all spark jar names contain .SNAPSHOT ? And
not sure how it is packed within docker container.

[image: image.png]

On Sun, Nov 20, 2022 at 6:19 PM Gnana Kumar <gn...@gmail.com>
wrote:

> Thanks Chris for your guidance.Your maven commands have worked really.
>
> I have been able to build the source and generate the binary distribution
> as well using below steps.
>
> >mvn -Denforcer.skip=true -DrecompileMode=all -Pkubernetes -Pvolcano
> -Pscala-2.12 -DskipTests clean package
>
> >dev/./make-distribution.sh -Pkubernetes -Denforcer.skip=true
> -DrecompileMode=all -Pkubernetes -Pvolcano -Pscala-2.12 -DskipTests
>
> >docker build -t spark3.3.2_gnana_volcano_scheduler_snapshot
> -f kubernetes/dockerfiles/spark/Dockerfile .
> >docker push spark3.3.2_gnana_volcano_scheduler_snapshot:latest
>
> spark_3.3.2/bin/spark-submit  \
>            --verbose \
>            --class com.demo.spark.SpringBootStarter  \
>            --master k8s://https://$KUBERNETES_MASTER_IP:443 \
>            --deploy-mode cluster \
>            --name sparkSampleApp2 \
>            --conf spark.kubernetes.namespace=default \
>            --conf spark.network.timeout=300 \
>            --conf spark.executor.instances=1 \
>            --conf spark.kubernetes.scheduler.name=volcano \
>            --conf
> spark.kubernetes.scheduler.volcano.podGroupTemplateFile=/home/gnana_kumar123/spark/volcano_spark_podgroup_template_low_priority.yaml
> \
>            --conf
> spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
> \
>            --conf
> spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
> \
>
> But unfortunately, when I run the Spark-Submit to run spark job in
> Kubernetes Cluster, it says class not found exception.
>
> Please find the exception from Spark-Submit.
>
> Using Spark's default log4j profile:
> org/apache/spark/log4j-defaults.properties
> 22/11/20 12:40:52 INFO SparkKubernetesClientFactory: Auto-configuring K8S
> client using current context from users K8S config file
> Exception in thread "main" java.lang.ClassNotFoundException:
> org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
>
> So,I have extracted the jar from dist/jars and I have verified that
> Spark-kubernetes jar has the VolcanoFeatureStep.class but I'm not sure if
> it is really packed within  docker image.
>
> Thanks
> Gnana
>
>
>
> On Sat, Nov 19, 2022 at 12:27 AM Chris Nauroth <cn...@apache.org>
> wrote:
>
>> Hello Gnana,
>>
>> I'm bringing this thread back to the user@ list for the benefit of
>> anyone else who might want to try this feature.
>>
>> Running this from the root of the source tree should give you a working
>> full build with Kubernetes and the experimental Volcano feature, using
>> Scala 2.12:
>>
>> build/mvn -Pkubernetes -Pvolcano -Pscala-2.12 -DskipTests clean package
>>
>> If you want to use Scala 2.13, it would be this:
>>
>> dev/change-scala-version.sh 2.13
>> build/mvn -Pkubernetes -Pvolcano -Pscala-2.13 -DskipTests clean package
>>
>> I don't expect you'd need to replace all jars in your deployment.
>> However, in addition to spark-kubernetes.jar, I expect you'll need to get
>> the Volcano client classes onto the classpath. Those are in
>> volcano-client-5.12.2.jar and volcano-model-v1beta1-5.12.2.jar.
>>
>> I haven't tested this new feature myself, so I don't know if there are
>> other steps you'll hit after this. Speaking just in terms of what the build
>> does though, this should be sufficient.
>>
>> I hope this helps.
>>
>> Chris Nauroth
>>
>>
>> On Thu, Nov 17, 2022 at 11:32 PM Gnana Kumar <gn...@gmail.com>
>> wrote:
>>
>>> I have maven built the spark-kubernetes jar (
>>> spark-kubernetes_2.12-3.3.2-SNAPSHOT ) but when I build parent spark
>>> directory , the build fails.
>>>
>>> mvn clean install -Denforcer.skip=true -Pvolcano -DskipTests
>>> -Dcheckstyle.skip
>>>
>>> On Fri, Nov 18, 2022 at 12:47 PM Gnana Kumar <gn...@gmail.com>
>>> wrote:
>>>
>>>> Also please confirm if I have to use the SNAPSHOT version of all Spark
>>>> jars for Volcano scheduling or only kubernete jar
>>>> (spark-kubernetes_2.12-3.3.2-SNAPSHOT.jar) is alone enough to perform
>>>> scheduling.
>>>>
>>>> Thanks
>>>> Gnana
>>>>
>>>> On Fri, Nov 18, 2022 at 10:27 AM Gnana Kumar <gn...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Chris,
>>>>>
>>>>> Thanks for the clarification.
>>>>>
>>>>> I have tried the below steps but getting below error. Please help me
>>>>> to resolve this error and I would need the Volcano feature available in my
>>>>> Spark-Kubernetes Jar.
>>>>>
>>>>> >git clone https://github.com/apache/spark.git -b branch-3.3
>>>>> >cd spark
>>>>> >mvn clean install -Denforcer.skip=true -Pvolcano -DskipTests
>>>>>
>>>>> [INFO]
>>>>> ------------------------------------------------------------------------
>>>>> [INFO] Reactor Summary for Spark Project Parent POM 3.3.2-SNAPSHOT:
>>>>> [INFO]
>>>>> [INFO] Spark Project Parent POM ........................... SUCCESS
>>>>> [02:02 min]
>>>>> [INFO] Spark Project Tags ................................. FAILURE [
>>>>> 16.548 s]
>>>>> [INFO] Spark Project Sketch ............................... SKIPPED
>>>>> [INFO] Spark Project Local DB ............................. SKIPPED
>>>>> [INFO] Spark Project Networking ........................... SKIPPED
>>>>> [INFO] Spark Project Shuffle Streaming Service ............ SKIPPED
>>>>> [INFO] Spark Project Unsafe ............................... SKIPPED
>>>>> [INFO] Spark Project Launcher ............................. SKIPPED
>>>>> [INFO] Spark Project Core ................................. SKIPPED
>>>>> [INFO] Spark Project ML Local Library ..................... SKIPPED
>>>>> [INFO] Spark Project GraphX ............................... SKIPPED
>>>>> [INFO] Spark Project Streaming ............................ SKIPPED
>>>>> [INFO] Spark Project Catalyst ............................. SKIPPED
>>>>> [INFO] Spark Project SQL .................................. SKIPPED
>>>>> [INFO] Spark Project ML Library ........................... SKIPPED
>>>>> [INFO] Spark Project Tools ................................ SKIPPED
>>>>> [INFO] Spark Project Hive ................................. SKIPPED
>>>>> [INFO] Spark Project REPL ................................. SKIPPED
>>>>> [INFO] Spark Project Assembly ............................. SKIPPED
>>>>> [INFO] Kafka 0.10+ Token Provider for Streaming ........... SKIPPED
>>>>> [INFO] Spark Integration for Kafka 0.10 ................... SKIPPED
>>>>> [INFO] Kafka 0.10+ Source for Structured Streaming ........ SKIPPED
>>>>> [INFO] Spark Project Examples ............................. SKIPPED
>>>>> [INFO] Spark Integration for Kafka 0.10 Assembly .......... SKIPPED
>>>>> [INFO] Spark Avro ......................................... SKIPPED
>>>>> [INFO]
>>>>> ------------------------------------------------------------------------
>>>>> [INFO] BUILD FAILURE
>>>>> [INFO]
>>>>> ------------------------------------------------------------------------
>>>>> [INFO] Total time:  02:21 min
>>>>> [INFO] Finished at: 2022-11-18T10:23:08+05:30
>>>>> [INFO]
>>>>> ------------------------------------------------------------------------
>>>>> [WARNING] The requested profile "volcano" could not be activated
>>>>> because it does not exist.
>>>>> [ERROR] Failed to execute goal
>>>>> net.alchim31.maven:scala-maven-plugin:4.4.0:compile (scala-compile-first)
>>>>> on project spark-tags_2.12: Execution scala-compile-first of goal
>>>>> net.alchim31.maven:scala-maven-plugin:4.4.0:compile failed: An API
>>>>> incompatibility was encountered while executing
>>>>> net.alchim31.maven:scala-maven-plugin:4.4.0:compile:
>>>>> java.lang.NoSuchMethodError:
>>>>> org.fusesource.jansi.AnsiConsole.wrapOutputStream(Ljava/io/OutputStream;)Ljava/io/OutputStream;
>>>>>
>>>>> Thanks
>>>>> Gnana
>>>>>
>>>>> On Thu, Nov 17, 2022 at 5:09 AM Chris Nauroth <cn...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Hello Gnana,
>>>>>>
>>>>>> I think it's intentional that this is excluded from the binary
>>>>>> release. By default, the build excludes this class [1]. It must be enabled
>>>>>> in the build by activating a Maven profile [2]. The release script does not
>>>>>> activate this profile [3].
>>>>>>
>>>>>> See the relevant pull requests ([4], [5]) for discussion of how this
>>>>>> feature is considered experimental and therefore excluded by default from
>>>>>> the previously GA'd 3.3 release line. If you want to use the feature, you
>>>>>> still have the option of building from source with the -Pvolcano profile
>>>>>> activated.
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/apache/spark/blob/branch-3.3/resource-managers/kubernetes/core/pom.xml#L135
>>>>>> [2]
>>>>>> https://github.com/apache/spark/blob/branch-3.3/resource-managers/kubernetes/core/pom.xml#L35-L54
>>>>>> [3]
>>>>>> https://github.com/apache/spark/blob/branch-3.3/dev/create-release/release-build.sh
>>>>>> [4] https://github.com/apache/spark/pull/34456
>>>>>> [5] https://github.com/apache/spark/pull/35422
>>>>>>
>>>>>> Chris Nauroth
>>>>>>
>>>>>>
>>>>>> On Wed, Nov 16, 2022 at 7:23 AM Gnana Kumar <gn...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi There,
>>>>>>>
>>>>>>> I have installed Spark 3.3.1 and tried to use the following
>>>>>>> configuration in Spark Submit for a spark job to run in Kubernetes Cluster
>>>>>>> and I have got class not found exception for the reference
>>>>>>> "VolcanoFeatureStep"
>>>>>>>
>>>>>>> --conf
>>>>>>> spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
>>>>>>> --conf
>>>>>>> spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
>>>>>>>
>>>>>>> When I unzipped the spark 3.3.1 archive, I could not see the
>>>>>>> VolcanoFeatureStep.class file.
>>>>>>>
>>>>>>> May I know if Volcano feature has been released in v3.3.1 ? How to
>>>>>>> resolve this Class not found exception ?
>>>>>>> Kindly help resolving this issue.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Gnana
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Thanks
>>>>> Gnana
>>>>>
>>>>
>>>>
>>>> --
>>>> Thanks
>>>> Gnana
>>>>
>>>
>>>
>>> --
>>> Thanks
>>> Gnana
>>>
>>
>
> --
> Thanks
> Gnana
>


-- 
Thanks
Gnana

Re: VolcanoFeatureStep( Custom Scheduler ) not found in Spark 3.3.1 archive

Posted by Gnana Kumar <gn...@gmail.com>.
Thanks Chris for your guidance.Your maven commands have worked really.

I have been able to build the source and generate the binary distribution
as well using below steps.

>mvn -Denforcer.skip=true -DrecompileMode=all -Pkubernetes -Pvolcano
-Pscala-2.12 -DskipTests clean package

>dev/./make-distribution.sh -Pkubernetes -Denforcer.skip=true
-DrecompileMode=all -Pkubernetes -Pvolcano -Pscala-2.12 -DskipTests

>docker build -t spark3.3.2_gnana_volcano_scheduler_snapshot
-f kubernetes/dockerfiles/spark/Dockerfile .
>docker push spark3.3.2_gnana_volcano_scheduler_snapshot:latest

spark_3.3.2/bin/spark-submit  \
           --verbose \
           --class com.demo.spark.SpringBootStarter  \
           --master k8s://https://$KUBERNETES_MASTER_IP:443 \
           --deploy-mode cluster \
           --name sparkSampleApp2 \
           --conf spark.kubernetes.namespace=default \
           --conf spark.network.timeout=300 \
           --conf spark.executor.instances=1 \
           --conf spark.kubernetes.scheduler.name=volcano \
           --conf
spark.kubernetes.scheduler.volcano.podGroupTemplateFile=/home/gnana_kumar123/spark/volcano_spark_podgroup_template_low_priority.yaml
\
           --conf
spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
\
           --conf
spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
\

But unfortunately, when I run the Spark-Submit to run spark job in
Kubernetes Cluster, it says class not found exception.

Please find the exception from Spark-Submit.

Using Spark's default log4j profile:
org/apache/spark/log4j-defaults.properties
22/11/20 12:40:52 INFO SparkKubernetesClientFactory: Auto-configuring K8S
client using current context from users K8S config file
Exception in thread "main" java.lang.ClassNotFoundException:
org.apache.spark.deploy.k8s.features.VolcanoFeatureStep

So,I have extracted the jar from dist/jars and I have verified that
Spark-kubernetes jar has the VolcanoFeatureStep.class but I'm not sure if
it is really packed within  docker image.

Thanks
Gnana



On Sat, Nov 19, 2022 at 12:27 AM Chris Nauroth <cn...@apache.org> wrote:

> Hello Gnana,
>
> I'm bringing this thread back to the user@ list for the benefit of anyone
> else who might want to try this feature.
>
> Running this from the root of the source tree should give you a working
> full build with Kubernetes and the experimental Volcano feature, using
> Scala 2.12:
>
> build/mvn -Pkubernetes -Pvolcano -Pscala-2.12 -DskipTests clean package
>
> If you want to use Scala 2.13, it would be this:
>
> dev/change-scala-version.sh 2.13
> build/mvn -Pkubernetes -Pvolcano -Pscala-2.13 -DskipTests clean package
>
> I don't expect you'd need to replace all jars in your deployment. However,
> in addition to spark-kubernetes.jar, I expect you'll need to get the
> Volcano client classes onto the classpath. Those are in
> volcano-client-5.12.2.jar and volcano-model-v1beta1-5.12.2.jar.
>
> I haven't tested this new feature myself, so I don't know if there are
> other steps you'll hit after this. Speaking just in terms of what the build
> does though, this should be sufficient.
>
> I hope this helps.
>
> Chris Nauroth
>
>
> On Thu, Nov 17, 2022 at 11:32 PM Gnana Kumar <gn...@gmail.com>
> wrote:
>
>> I have maven built the spark-kubernetes jar (
>> spark-kubernetes_2.12-3.3.2-SNAPSHOT ) but when I build parent spark
>> directory , the build fails.
>>
>> mvn clean install -Denforcer.skip=true -Pvolcano -DskipTests
>> -Dcheckstyle.skip
>>
>> On Fri, Nov 18, 2022 at 12:47 PM Gnana Kumar <gn...@gmail.com>
>> wrote:
>>
>>> Also please confirm if I have to use the SNAPSHOT version of all Spark
>>> jars for Volcano scheduling or only kubernete jar
>>> (spark-kubernetes_2.12-3.3.2-SNAPSHOT.jar) is alone enough to perform
>>> scheduling.
>>>
>>> Thanks
>>> Gnana
>>>
>>> On Fri, Nov 18, 2022 at 10:27 AM Gnana Kumar <gn...@gmail.com>
>>> wrote:
>>>
>>>> Hi Chris,
>>>>
>>>> Thanks for the clarification.
>>>>
>>>> I have tried the below steps but getting below error. Please help me to
>>>> resolve this error and I would need the Volcano feature available in my
>>>> Spark-Kubernetes Jar.
>>>>
>>>> >git clone https://github.com/apache/spark.git -b branch-3.3
>>>> >cd spark
>>>> >mvn clean install -Denforcer.skip=true -Pvolcano -DskipTests
>>>>
>>>> [INFO]
>>>> ------------------------------------------------------------------------
>>>> [INFO] Reactor Summary for Spark Project Parent POM 3.3.2-SNAPSHOT:
>>>> [INFO]
>>>> [INFO] Spark Project Parent POM ........................... SUCCESS
>>>> [02:02 min]
>>>> [INFO] Spark Project Tags ................................. FAILURE [
>>>> 16.548 s]
>>>> [INFO] Spark Project Sketch ............................... SKIPPED
>>>> [INFO] Spark Project Local DB ............................. SKIPPED
>>>> [INFO] Spark Project Networking ........................... SKIPPED
>>>> [INFO] Spark Project Shuffle Streaming Service ............ SKIPPED
>>>> [INFO] Spark Project Unsafe ............................... SKIPPED
>>>> [INFO] Spark Project Launcher ............................. SKIPPED
>>>> [INFO] Spark Project Core ................................. SKIPPED
>>>> [INFO] Spark Project ML Local Library ..................... SKIPPED
>>>> [INFO] Spark Project GraphX ............................... SKIPPED
>>>> [INFO] Spark Project Streaming ............................ SKIPPED
>>>> [INFO] Spark Project Catalyst ............................. SKIPPED
>>>> [INFO] Spark Project SQL .................................. SKIPPED
>>>> [INFO] Spark Project ML Library ........................... SKIPPED
>>>> [INFO] Spark Project Tools ................................ SKIPPED
>>>> [INFO] Spark Project Hive ................................. SKIPPED
>>>> [INFO] Spark Project REPL ................................. SKIPPED
>>>> [INFO] Spark Project Assembly ............................. SKIPPED
>>>> [INFO] Kafka 0.10+ Token Provider for Streaming ........... SKIPPED
>>>> [INFO] Spark Integration for Kafka 0.10 ................... SKIPPED
>>>> [INFO] Kafka 0.10+ Source for Structured Streaming ........ SKIPPED
>>>> [INFO] Spark Project Examples ............................. SKIPPED
>>>> [INFO] Spark Integration for Kafka 0.10 Assembly .......... SKIPPED
>>>> [INFO] Spark Avro ......................................... SKIPPED
>>>> [INFO]
>>>> ------------------------------------------------------------------------
>>>> [INFO] BUILD FAILURE
>>>> [INFO]
>>>> ------------------------------------------------------------------------
>>>> [INFO] Total time:  02:21 min
>>>> [INFO] Finished at: 2022-11-18T10:23:08+05:30
>>>> [INFO]
>>>> ------------------------------------------------------------------------
>>>> [WARNING] The requested profile "volcano" could not be activated
>>>> because it does not exist.
>>>> [ERROR] Failed to execute goal
>>>> net.alchim31.maven:scala-maven-plugin:4.4.0:compile (scala-compile-first)
>>>> on project spark-tags_2.12: Execution scala-compile-first of goal
>>>> net.alchim31.maven:scala-maven-plugin:4.4.0:compile failed: An API
>>>> incompatibility was encountered while executing
>>>> net.alchim31.maven:scala-maven-plugin:4.4.0:compile:
>>>> java.lang.NoSuchMethodError:
>>>> org.fusesource.jansi.AnsiConsole.wrapOutputStream(Ljava/io/OutputStream;)Ljava/io/OutputStream;
>>>>
>>>> Thanks
>>>> Gnana
>>>>
>>>> On Thu, Nov 17, 2022 at 5:09 AM Chris Nauroth <cn...@apache.org>
>>>> wrote:
>>>>
>>>>> Hello Gnana,
>>>>>
>>>>> I think it's intentional that this is excluded from the binary
>>>>> release. By default, the build excludes this class [1]. It must be enabled
>>>>> in the build by activating a Maven profile [2]. The release script does not
>>>>> activate this profile [3].
>>>>>
>>>>> See the relevant pull requests ([4], [5]) for discussion of how this
>>>>> feature is considered experimental and therefore excluded by default from
>>>>> the previously GA'd 3.3 release line. If you want to use the feature, you
>>>>> still have the option of building from source with the -Pvolcano profile
>>>>> activated.
>>>>>
>>>>> [1]
>>>>> https://github.com/apache/spark/blob/branch-3.3/resource-managers/kubernetes/core/pom.xml#L135
>>>>> [2]
>>>>> https://github.com/apache/spark/blob/branch-3.3/resource-managers/kubernetes/core/pom.xml#L35-L54
>>>>> [3]
>>>>> https://github.com/apache/spark/blob/branch-3.3/dev/create-release/release-build.sh
>>>>> [4] https://github.com/apache/spark/pull/34456
>>>>> [5] https://github.com/apache/spark/pull/35422
>>>>>
>>>>> Chris Nauroth
>>>>>
>>>>>
>>>>> On Wed, Nov 16, 2022 at 7:23 AM Gnana Kumar <gn...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi There,
>>>>>>
>>>>>> I have installed Spark 3.3.1 and tried to use the following
>>>>>> configuration in Spark Submit for a spark job to run in Kubernetes Cluster
>>>>>> and I have got class not found exception for the reference
>>>>>> "VolcanoFeatureStep"
>>>>>>
>>>>>> --conf
>>>>>> spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
>>>>>> --conf
>>>>>> spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
>>>>>>
>>>>>> When I unzipped the spark 3.3.1 archive, I could not see the
>>>>>> VolcanoFeatureStep.class file.
>>>>>>
>>>>>> May I know if Volcano feature has been released in v3.3.1 ? How to
>>>>>> resolve this Class not found exception ?
>>>>>> Kindly help resolving this issue.
>>>>>>
>>>>>> Thanks
>>>>>> Gnana
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Thanks
>>>> Gnana
>>>>
>>>
>>>
>>> --
>>> Thanks
>>> Gnana
>>>
>>
>>
>> --
>> Thanks
>> Gnana
>>
>

-- 
Thanks
Gnana

Re: VolcanoFeatureStep( Custom Scheduler ) not found in Spark 3.3.1 archive

Posted by Chris Nauroth <cn...@apache.org>.
Hello Gnana,

I'm bringing this thread back to the user@ list for the benefit of anyone
else who might want to try this feature.

Running this from the root of the source tree should give you a working
full build with Kubernetes and the experimental Volcano feature, using
Scala 2.12:

build/mvn -Pkubernetes -Pvolcano -Pscala-2.12 -DskipTests clean package

If you want to use Scala 2.13, it would be this:

dev/change-scala-version.sh 2.13
build/mvn -Pkubernetes -Pvolcano -Pscala-2.13 -DskipTests clean package

I don't expect you'd need to replace all jars in your deployment. However,
in addition to spark-kubernetes.jar, I expect you'll need to get the
Volcano client classes onto the classpath. Those are in
volcano-client-5.12.2.jar and volcano-model-v1beta1-5.12.2.jar.
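As a quick sanity check before deploying, you can confirm the class actually made it into the build. This is only a sketch; the jar path below is my guess at the default Maven output location, so adjust it to wherever your build writes the artifact:

```shell
# Hypothetical path: adjust to your build tree's actual output location.
K8S_JAR="resource-managers/kubernetes/core/target/spark-kubernetes_2.12-3.3.2-SNAPSHOT.jar"

if [ ! -f "$K8S_JAR" ]; then
  # Wrong path, or the build has not run yet.
  echo "jar not found at $K8S_JAR"
elif unzip -l "$K8S_JAR" | grep -q 'VolcanoFeatureStep'; then
  echo "VolcanoFeatureStep is present"
else
  # The jar exists but was built without the profile.
  echo "VolcanoFeatureStep is missing: rebuild with -Pvolcano"
fi
```

If the class is missing here, no amount of deployment configuration will fix the ClassNotFoundException, so it is worth checking this first.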

I haven't tested this new feature myself, so I don't know if there are
other steps you'll hit after this. Speaking just in terms of what the build
does though, this should be sufficient.
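Putting the pieces together, a submission using the feature step from your original message might look roughly like the sketch below. The API server address, container image name, and application jar are all placeholders you would replace with your own values, and it assumes the image was built from a -Pvolcano distribution with the Volcano client jars included:

```shell
# Sketch only: <api-server>, the image name, and the app jar are placeholders.
# Assumes the Volcano client jars are already in the image or passed via --jars.
spark-submit \
  --master k8s://https://<api-server>:6443 \
  --deploy-mode cluster \
  --conf spark.kubernetes.container.image=<your-volcano-enabled-image> \
  --conf spark.kubernetes.scheduler.name=volcano \
  --conf spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep \
  --conf spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep \
  local:///opt/spark/examples/jars/<your-app>.jar
```

spark.kubernetes.scheduler.name tells the driver and executor pods to request the Volcano scheduler, while the feature steps decorate the pod specs; both parts are needed for the scheduling to take effect.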

I hope this helps.

Chris Nauroth


On Thu, Nov 17, 2022 at 11:32 PM Gnana Kumar <gn...@gmail.com>
wrote:

> I have built the spark-kubernetes jar
> (spark-kubernetes_2.12-3.3.2-SNAPSHOT) with Maven, but when I build the
> parent spark directory, the build fails.
>
> mvn clean install -Denforcer.skip=true -Pvolcano -DskipTests
> -Dcheckstyle.skip
>
> On Fri, Nov 18, 2022 at 12:47 PM Gnana Kumar <gn...@gmail.com>
> wrote:
>
>> Also, please confirm whether I have to use the SNAPSHOT versions of all
>> Spark jars for Volcano scheduling, or whether the Kubernetes jar
>> (spark-kubernetes_2.12-3.3.2-SNAPSHOT.jar) alone is enough to perform
>> scheduling.
>>
>> Thanks
>> Gnana
>>
>> On Fri, Nov 18, 2022 at 10:27 AM Gnana Kumar <gn...@gmail.com>
>> wrote:
>>
>>> Hi Chris,
>>>
>>> Thanks for the clarification.
>>>
>>> I have tried the steps below but am getting the following error. Please
>>> help me resolve it; I need the Volcano feature available in my
>>> spark-kubernetes jar.
>>>
>>> >git clone https://github.com/apache/spark.git -b branch-3.3
>>> >cd spark
>>> >mvn clean install -Denforcer.skip=true -Pvolcano -DskipTests
>>>
>>> [INFO]
>>> ------------------------------------------------------------------------
>>> [INFO] Reactor Summary for Spark Project Parent POM 3.3.2-SNAPSHOT:
>>> [INFO]
>>> [INFO] Spark Project Parent POM ........................... SUCCESS
>>> [02:02 min]
>>> [INFO] Spark Project Tags ................................. FAILURE [
>>> 16.548 s]
>>> [INFO] Spark Project Sketch ............................... SKIPPED
>>> [INFO] Spark Project Local DB ............................. SKIPPED
>>> [INFO] Spark Project Networking ........................... SKIPPED
>>> [INFO] Spark Project Shuffle Streaming Service ............ SKIPPED
>>> [INFO] Spark Project Unsafe ............................... SKIPPED
>>> [INFO] Spark Project Launcher ............................. SKIPPED
>>> [INFO] Spark Project Core ................................. SKIPPED
>>> [INFO] Spark Project ML Local Library ..................... SKIPPED
>>> [INFO] Spark Project GraphX ............................... SKIPPED
>>> [INFO] Spark Project Streaming ............................ SKIPPED
>>> [INFO] Spark Project Catalyst ............................. SKIPPED
>>> [INFO] Spark Project SQL .................................. SKIPPED
>>> [INFO] Spark Project ML Library ........................... SKIPPED
>>> [INFO] Spark Project Tools ................................ SKIPPED
>>> [INFO] Spark Project Hive ................................. SKIPPED
>>> [INFO] Spark Project REPL ................................. SKIPPED
>>> [INFO] Spark Project Assembly ............................. SKIPPED
>>> [INFO] Kafka 0.10+ Token Provider for Streaming ........... SKIPPED
>>> [INFO] Spark Integration for Kafka 0.10 ................... SKIPPED
>>> [INFO] Kafka 0.10+ Source for Structured Streaming ........ SKIPPED
>>> [INFO] Spark Project Examples ............................. SKIPPED
>>> [INFO] Spark Integration for Kafka 0.10 Assembly .......... SKIPPED
>>> [INFO] Spark Avro ......................................... SKIPPED
>>> [INFO]
>>> ------------------------------------------------------------------------
>>> [INFO] BUILD FAILURE
>>> [INFO]
>>> ------------------------------------------------------------------------
>>> [INFO] Total time:  02:21 min
>>> [INFO] Finished at: 2022-11-18T10:23:08+05:30
>>> [INFO]
>>> ------------------------------------------------------------------------
>>> [WARNING] The requested profile "volcano" could not be activated because
>>> it does not exist.
>>> [ERROR] Failed to execute goal
>>> net.alchim31.maven:scala-maven-plugin:4.4.0:compile (scala-compile-first)
>>> on project spark-tags_2.12: Execution scala-compile-first of goal
>>> net.alchim31.maven:scala-maven-plugin:4.4.0:compile failed: An API
>>> incompatibility was encountered while executing
>>> net.alchim31.maven:scala-maven-plugin:4.4.0:compile:
>>> java.lang.NoSuchMethodError:
>>> org.fusesource.jansi.AnsiConsole.wrapOutputStream(Ljava/io/OutputStream;)Ljava/io/OutputStream;
>>>
>>> Thanks
>>> Gnana
>>>