Posted to user@spark.apache.org by Gnana Kumar <gn...@gmail.com> on 2022/02/17 12:15:23 UTC

Fwd: Spark 3.2.1 in Google Kubernetes Version 1.19 or 1.21 - SparkSubmit Error

Hi There,

I'm getting the below error even though I pass --class and --jars values
when submitting a Spark job through spark-submit.
Please help.

Exception in thread "main" org.apache.spark.SparkException: Failed to get main class in JAR with error 'File file:/home/gnana_kumar123/spark/ does not exist'. Please specify one with --class.
        at org.apache.spark.deploy.SparkSubmit.error(SparkSubmit.scala:972)
        at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:486)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:898)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)



~/spark/spark-3.2.1-bin-hadoop3.2/bin/spark-submit \
           --master k8s://${K8S_SERVER}:443 \
           --deploy-mode cluster \
           --name sparkBQ \
           --conf spark.kubernetes.namespace=$NAMESPACE \
           --conf spark.network.timeout=300 \
           --conf spark.executor.instances=3 \
           --conf spark.kubernetes.allocation.batch.size=3 \
           --conf spark.kubernetes.allocation.batch.delay=1 \
           --conf spark.driver.cores=3 \
           --conf spark.executor.cores=3 \
           --conf spark.driver.memory=8092m \
           --conf spark.executor.memory=8092m \
           --conf spark.dynamicAllocation.enabled=true \
           --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
           --conf spark.kubernetes.driver.container.image=${SPARK_IMAGE} \
           --conf spark.kubernetes.executor.container.image=${SPARK_IMAGE} \
           --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
           --conf spark.driver.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
           --conf spark.executor.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
           --class org.apache.spark.examples.SparkPi \
           --jars /home/gnana_kumar123/spark/spark-3.2.1-bin-hadoop3.2/examples/jars/spark-examples_2.12-3.2.1.jar
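
A hedged note on the failure above: spark-submit expects the application JAR as the
final positional argument, while --jars only distributes additional dependency JARs.
One plausible reading of the error is that the submit ends with --jars and no primary
resource, leaving spark-submit with no JAR to look a main class up in. A minimal
sketch of the same call with the JAR moved into position (same environment variables
as above, trimmed to the essential flags):

# Sketch: application JAR as the last positional argument, not a --jars value.
# In cluster mode a JAR local to the submit host also needs
# spark.kubernetes.file.upload.path (or a gs:// / local:// URI), as discussed
# later in this thread.
~/spark/spark-3.2.1-bin-hadoop3.2/bin/spark-submit \
           --master k8s://${K8S_SERVER}:443 \
           --deploy-mode cluster \
           --name sparkBQ \
           --conf spark.kubernetes.namespace=$NAMESPACE \
           --conf spark.kubernetes.driver.container.image=${SPARK_IMAGE} \
           --conf spark.kubernetes.executor.container.image=${SPARK_IMAGE} \
           --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
           --class org.apache.spark.examples.SparkPi \
           /home/gnana_kumar123/spark/spark-3.2.1-bin-hadoop3.2/examples/jars/spark-examples_2.12-3.2.1.jar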

Thanks
GK



On Wed, Feb 16, 2022 at 11:11 PM Gnana Kumar <gn...@gmail.com>
wrote:

> Hi Mich
>
> Also, I would like to run Spark nodes (master and worker nodes in
> Kubernetes) and then run my Java Spark application from a JAR file.
>
> Can you please let me know how to specify the JAR file and the main class?
>
> Thanks
> GK
>
> On Wed, Feb 16, 2022 at 10:36 PM Gnana Kumar <gn...@gmail.com>
> wrote:
>
>> Hi Mich,
>>
>> I have built the image using the Dockerfile present
>> in spark-3.2.1-bin-hadoop3.2.tgz.
>>
>> I have also pushed the same image to my Docker Hub account, i.e.
>> docker.io/gnanakumar123/spark3.2.1:latest
>>
>> I believe spark-submit can pull the image from Docker Hub when I run from
>> GKE's Cloud Shell. Please confirm.
>>
>> Below is the command I'm running.
>>
>> ./spark-submit \
>>   --master k8s://$K8S_SERVER \
>>   --deploy-mode cluster \
>>   --name spark-driver-pod \
>>   --class org.apache.spark.examples.SparkPi \
>>   --conf spark.executor.instances=2 \
>>   --conf spark.kubernetes.driver.container.image=docker.io/gnanakumar123/spark3.2.1:latest \
>>   --conf spark.kubernetes.executor.container.image=docker.io/gnanakumar123/spark3.2.1:latest \
>>   --conf spark.kubernetes.container.image=docker.io/gnanakumar123/spark3.2.1:latest \
>>   --conf spark.kubernetes.driver.pod.name=spark-driver-pod \
>>   --conf spark.kubernetes.namespace=spark-demo \
>>   --conf spark.kubernetes.container.image.pullPolicy=Never \
>>   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
>>     $SPARK_HOME/examples/jars/spark-examples_2.12-3.2.1.jar
>>
>> Thanks
>> GK
>>
>>
>> On Mon, Feb 14, 2022 at 10:50 PM Mich Talebzadeh <
>> mich.talebzadeh@gmail.com> wrote:
>>
>>> Hi
>>>
>>>
>>> It is complaining about the missing driver container image. Does
>>> $SPARK_IMAGE point to a valid image in the GCP container registry?
>>>
>>> Example of a Docker image for PySpark:
>>>
>>> IMAGEDRIVER="eu.gcr.io/<PROJRECT>/spark-py:3.1.1-scala_2.12-8-jre-slim-buster-java8PlusPackages"
>>>
>>>
>>>         spark-submit --verbose \
>>>            --properties-file ${property_file} \
>>>            --master k8s://https://$KUBERNETES_MASTER_IP:443 \
>>>            --deploy-mode cluster \
>>>            --name sparkBQ \
>>>            --py-files $CODE_DIRECTORY_CLOUD/spark_on_gke.zip \
>>>            --conf spark.kubernetes.namespace=$NAMESPACE \
>>>            --conf spark.network.timeout=300 \
>>>            --conf spark.executor.instances=$NEXEC \
>>>            --conf spark.kubernetes.allocation.batch.size=3 \
>>>            --conf spark.kubernetes.allocation.batch.delay=1 \
>>>            --conf spark.driver.cores=3 \
>>>            --conf spark.executor.cores=3 \
>>>            --conf spark.driver.memory=8092m \
>>>            --conf spark.executor.memory=8092m \
>>>            --conf spark.dynamicAllocation.enabled=true \
>>>            --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
>>>            --conf spark.kubernetes.driver.container.image=${IMAGEDRIVER} \
>>>            --conf spark.kubernetes.executor.container.image=${IMAGEDRIVER} \
>>>            --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-bq \
>>>            --conf spark.driver.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
>>>            --conf spark.executor.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
>>>            $CODE_DIRECTORY_CLOUD/${APPLICATION}
>>>
>>> HTH
>>>
>>>
>>>    view my Linkedin profile
>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>
>>>
>>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>>
>>>
>>>
>>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>>> any loss, damage or destruction of data or any other property which may
>>> arise from relying on this email's technical content is explicitly
>>> disclaimed. The author will in no case be liable for any monetary damages
>>> arising from such loss, damage or destruction.
>>>
>>>
>>>
>>>
>>> On Mon, 14 Feb 2022 at 17:04, Gnana Kumar <gn...@gmail.com>
>>> wrote:
>>>
>>>> Also, I'm using the below parameters while submitting the Spark job.
>>>>
>>>> spark-submit \
>>>>   --master k8s://$K8S_SERVER \
>>>>   --deploy-mode cluster \
>>>>   --name $POD_NAME \
>>>>   --class org.apache.spark.examples.SparkPi \
>>>>   --conf spark.executor.instances=2 \
>>>>   --conf spark.kubernetes.driver.container.image=$SPARK_IMAGE \
>>>>   --conf spark.kubernetes.executor.container.image=$SPARK_IMAGE \
>>>>   --conf spark.kubernetes.container.image=$SPARK_IMAGE \
>>>>   --conf spark.kubernetes.driver.pod.name=$POD_NAME \
>>>>   --conf spark.kubernetes.namespace=spark-demo \
>>>>   --conf spark.kubernetes.container.image.pullPolicy=Never \
>>>>   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
>>>>     $SPARK_HOME/examples/jars/spark-examples_2.12-3.2.1.jar
>>>>
>>>> On Mon, Feb 14, 2022 at 9:51 PM Gnana Kumar <gn...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi There,
>>>>>
>>>>> I have been trying to run Spark 3.2.1 in a Google Cloud Kubernetes
>>>>> cluster, version 1.19 or 1.21.
>>>>>
>>>>> But I keep getting the following error and cannot proceed.
>>>>>
>>>>> Please help me resolve this issue.
>>>>>
>>>>> 22/02/14 16:00:48 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
>>>>> Exception in thread "main" org.apache.spark.SparkException: Must specify the driver container image
>>>>>         at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$driverContainerImage$1(BasicDriverFeatureStep.scala:45)
>>>>>         at scala.Option.getOrElse(Option.scala:189)
>>>>>         at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.<init>(BasicDriverFeatureStep.scala:45)
>>>>>         at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:46)
>>>>>         at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:106)
>>>>>         at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$4(KubernetesClientApplication.scala:220)
>>>>>         at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$4$adapted(KubernetesClientApplication.scala:214)
>>>>>         at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2713)
>>>>>         at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:214)
>>>>>         at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:186)
>>>>>         at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
>>>>>         at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>>>>>         at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>>>>>         at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>>>>>         at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
>>>>>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
>>>>>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>>>>
>>>>> --
>>>>> Thanks
>>>>> Gnana
>>>>>
>>>>
>>>>
>>>> --
>>>> Thanks
>>>> Gnana
>>>>
>>>
>>
>> --
>> Thanks
>> Gnana
>>
>
>
> --
> Thanks
> Gnana
>


-- 
Thanks
Gnana



Re: Spark 3.2.1 in Google Kubernetes Version 1.19 or 1.21 - SparkSubmit Error

Posted by Mich Talebzadeh <mi...@gmail.com>.
Hi,

I need to arrange a class for members on using GCP with Dataproc or GCP with
Kubernetes, I think 🤔

OK, it is good practice to create a namespace spark for this purpose rather
than using the default namespace:


kubectl create namespace spark
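
A minimal sketch, on the assumption that every Spark resource then lives in that
namespace (service accounts are namespaced, so the account named in
spark.kubernetes.authenticate.driver.serviceAccountName must exist in the same
namespace the job is submitted into):

# create the driver's service account and give it edit rights, all in "spark"
kubectl create serviceaccount spark -n spark
kubectl create rolebinding spark-role --clusterrole=edit \
    --serviceaccount=spark:spark -n spark
# and submit with matching settings:
#   --conf spark.kubernetes.namespace=spark
#   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark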

Tell me exactly what you are trying to do. Are you running a test script
using GKE in GCP? To run spark-submit, are you using a cloud shell or a VM?
Have you installed the Spark binaries there?

HTH


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Fri, 18 Feb 2022 at 16:16, Gnana Kumar <gn...@gmail.com> wrote:

> Hi Mich
>
> I'm running Spark on GCP and this is the error:
>
> Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException: Operation: [create] for kind: [Pod] with name: [null] in namespace: [default] failed.
>
> Thanks
> GK
>
>
> On Fri, Feb 18, 2022 at 12:37 AM Mich Talebzadeh <
> mich.talebzadeh@gmail.com> wrote:
>
>> Just create a directory as below on a GCP storage bucket:
>>
>> CODE_DIRECTORY_CLOUD="gs://spark-on-k8s/codes/"
>>
>> Put your JAR file there:
>>
>> gsutil cp /opt/spark/examples/jars/spark-examples_2.12-3.2.1.jar $CODE_DIRECTORY_CLOUD
>>
>> and reference it in the submit:
>>
>>   --conf spark.kubernetes.file.upload.path=file:///tmp \
>>           $CODE_DIRECTORY_CLOUD/spark-examples_2.12-3.2.1.jar
>>
>> Where are you running spark-submit from?
>>
>>
>> HTH
>>
>>
>>    view my Linkedin profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>>
>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>
>>
>>
>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>> any loss, damage or destruction of data or any other property which may
>> arise from relying on this email's technical content is explicitly
>> disclaimed. The author will in no case be liable for any monetary damages
>> arising from such loss, damage or destruction.
>>
>>
>>
>>
>> On Thu, 17 Feb 2022 at 18:24, Gnana Kumar <gn...@gmail.com>
>> wrote:
>>
>>> Though I have created the Kubernetes RBAC as per the Spark site in my GKE
>>> cluster, I'm getting the pod name [null] error.
>>>
>>> kubectl create serviceaccount spark
>>> kubectl create clusterrolebinding spark-role --clusterrole=edit
>>> --serviceaccount=default:spark --namespace=default
>>>
>>> On Thu, Feb 17, 2022 at 11:31 PM Gnana Kumar <gn...@gmail.com>
>>> wrote:
>>>
>>>> Hi Mich
>>>>
>>>> This is the latest error I'm stuck with. Please help me resolve this
>>>> issue.
>>>>
>>>> Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException: Operation: [create] for kind: [Pod] with name: [null] in namespace: [default] failed.
>>>>
>>>> ~/spark/spark-3.2.1-bin-hadoop3.2/bin/spark-submit \
>>>>            --verbose \
>>>>            --class org.apache.spark.examples.SparkPi \
>>>>            --master k8s://${K8S_SERVER}:443 \
>>>>            --deploy-mode cluster \
>>>>            --name sparkBQ \
>>>>            --conf spark.kubernetes.namespace=$NAMESPACE \
>>>>            --conf spark.network.timeout=300 \
>>>>            --conf spark.executor.instances=3 \
>>>>            --conf spark.kubernetes.allocation.batch.size=3 \
>>>>            --conf spark.kubernetes.allocation.batch.delay=1 \
>>>>            --conf spark.driver.cores=3 \
>>>>            --conf spark.executor.cores=3 \
>>>>            --conf spark.driver.memory=8092m \
>>>>            --conf spark.executor.memory=8092m \
>>>>            --conf spark.dynamicAllocation.enabled=true \
>>>>            --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
>>>>            --conf spark.kubernetes.driver.pod.name=spark-pi-driver \
>>>>            --conf spark.kubernetes.driver.container.image=${SPARK_IMAGE} \
>>>>            --conf spark.kubernetes.executor.container.image=${SPARK_IMAGE} \
>>>>            --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
>>>>            --conf spark.driver.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
>>>>            --conf spark.executor.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
>>>>            --conf spark.kubernetes.file.upload.path=file:///tmp \
>>>>            local:///opt/spark/examples/jars/spark-examples_2.12-3.2.1.jar
>>>>
>>>> Thanks
>>>> GK
>>>>
>>>> On Thu, Feb 17, 2022 at 6:55 PM Mich Talebzadeh <
>>>> mich.talebzadeh@gmail.com> wrote:
>>>>
>>>>> Hi Gnana,
>>>>>
>>>>> That JAR file, /home/gnana_kumar123/spark/spark-3.2.1-bin-hadoop3.2/examples/jars/spark-examples_2.12-3.2.1.jar,
>>>>> is not visible to the GKE cluster, so not all nodes can read it. I suggest that
>>>>> you put it in a gs:// bucket in GCP and access it from there.
>>>>>
>>>>>
>>>>> HTH
>>>>>
>>>>>
>>>>>    view my Linkedin profile
>>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>>
>>>>>
>>>>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>>>>
>>>>>
>>>>>
>>>>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>>>>> any loss, damage or destruction of data or any other property which may
>>>>> arise from relying on this email's technical content is explicitly
>>>>> disclaimed. The author will in no case be liable for any monetary damages
>>>>> arising from such loss, damage or destruction.
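
Tying the quoted pieces together: the failing create call reports namespace [default],
and the RBAC above was also created in default. A hedged diagnostic sketch for checking
that the service account named by the submit can actually create pods in the namespace
the submit targets (uses kubectl's --as impersonation, and assumes RBAC is the issue
rather than, say, an unset $NAMESPACE):

# can the "spark" service account in default create pods in default?
kubectl auth can-i create pods -n default \
    --as=system:serviceaccount:default:spark
# and in the namespace the job actually targets:
kubectl auth can-i create pods -n $NAMESPACE \
    --as=system:serviceaccount:default:spark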

Re: Spark 3.2.1 in Google Kubernetes Version 1.19 or 1.21 - SparkSubmit Error

Posted by Gnana Kumar <gn...@gmail.com>.
Hi Mich

I'm running Spark on GCP and this is the error:

Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException: Operation: [create] for kind: [Pod] with name: [null] in namespace: [default] failed.
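
One hedged observation: the failure reports namespace [default] even though the
submit commands in this thread set spark.kubernetes.namespace=$NAMESPACE, and the
default value of that setting is "default", which is exactly what an unset
$NAMESPACE would look like. A quick pre-flight check (assumes bash and kubectl on
the submitting host):

# fail fast if the variables the submit relies on are empty
echo "NAMESPACE=${NAMESPACE:?NAMESPACE is not set}"
echo "K8S_SERVER=${K8S_SERVER:?K8S_SERVER is not set}"
kubectl get namespace "$NAMESPACE"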

Thanks
GK



-- 
Thanks
Gnana

Re: Spark 3.2.1 in Google Kubernetes Version 1.19 or 1.21 - SparkSubmit Error

Posted by Mich Talebzadeh <mi...@gmail.com>.
Just create a directory as below on a GCP storage bucket:

CODE_DIRECTORY_CLOUD="gs://spark-on-k8s/codes/"

Put your JAR file there:

gsutil cp /opt/spark/examples/jars/spark-examples_2.12-3.2.1.jar $CODE_DIRECTORY_CLOUD

and reference it in the submit:

  --conf spark.kubernetes.file.upload.path=file:///tmp \
          $CODE_DIRECTORY_CLOUD/spark-examples_2.12-3.2.1.jar
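
Putting those pieces together, a minimal end-to-end sketch (assumptions: the same
K8S_SERVER, NAMESPACE and SPARK_IMAGE variables used earlier in the thread, and a
Spark image whose classpath includes the GCS connector so gs:// URIs are readable
from inside the cluster):

# stage the example JAR once
CODE_DIRECTORY_CLOUD="gs://spark-on-k8s/codes/"
gsutil cp /opt/spark/examples/jars/spark-examples_2.12-3.2.1.jar $CODE_DIRECTORY_CLOUD

# then point spark-submit at the staged copy instead of a path on the submit host
spark-submit \
    --master k8s://${K8S_SERVER}:443 \
    --deploy-mode cluster \
    --name sparkBQ \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.kubernetes.namespace=$NAMESPACE \
    --conf spark.kubernetes.driver.container.image=${SPARK_IMAGE} \
    --conf spark.kubernetes.executor.container.image=${SPARK_IMAGE} \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
    ${CODE_DIRECTORY_CLOUD}spark-examples_2.12-3.2.1.jar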

Where are you running spark-submit from?


HTH


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Thu, 17 Feb 2022 at 18:24, Gnana Kumar <gn...@gmail.com> wrote:

> Though I have created the kubernetes RBAC as per Spark site in my GKE
> cluster,Im getting POD NAME null error.
>
> kubectl create serviceaccount spark
> kubectl create clusterrolebinding spark-role --clusterrole=edit
> --serviceaccount=default:spark --namespace=default
>
> On Thu, Feb 17, 2022 at 11:31 PM Gnana Kumar <gn...@gmail.com>
> wrote:
>
>> Hi Mich
>>
>> This is the latest error I'm stuck with. Please help me resolve this
>> issue.
>>
>> Exception in thread "main"
>> io.fabric8.kubernetes.client.KubernetesClientException: Operation: [create]
>>  for kind: [Pod]  with name: [null]  in namespace: [default]  failed.
>>
>> ~/spark/spark-3.2.1-bin-hadoop3.2/bin/spark-submit  \
>>            --verbose \
>>            --class org.apache.spark.examples.SparkPi \
>>            --master k8s://${K8S_SERVER}:443 \
>>            --deploy-mode cluster \
>>            --name sparkBQ \
>>            --conf spark.kubernetes.namespace=$NAMESPACE \
>>            --conf spark.network.timeout=300 \
>>            --conf spark.executor.instances=3 \
>>            --conf spark.kubernetes.allocation.batch.size=3 \
>>            --conf spark.kubernetes.allocation.batch.delay=1 \
>>            --conf spark.driver.cores=3 \
>>            --conf spark.executor.cores=3 \
>>            --conf spark.driver.memory=8092m \
>>            --conf spark.executor.memory=8092m \
>>            --conf spark.dynamicAllocation.enabled=true \
>>            --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
>>            --conf spark.kubernetes.driver.pod.name=spark-pi-driver \
>>            --conf spark.kubernetes.driver.container.image=${SPARK_IMAGE}
>>  \
>>            --conf spark.kubernetes.executor.container.image=
>> ${SPARK_IMAGE} \
>>
>>            --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
>>            --conf spark.driver.extraJavaOptions=
>> "-Dio.netty.tryReflectionSetAccessible=true" \
>>            --conf spark.executor.extraJavaOptions=
>> "-Dio.netty.tryReflectionSetAccessible=true"\
>>            --conf spark.kubernetes.file.upload.path=file:///tmp \
>>            local:///opt/spark/examples/jars/spark-examples_2.12-3.2.1.jar
>>
>> Thanks
>> GK
>>
>> On Thu, Feb 17, 2022 at 6:55 PM Mich Talebzadeh <
>> mich.talebzadeh@gmail.com> wrote:
>>
>>> Hi Gnana,
>>>
>>> That JAR file /home/gnana_kumar123/spark/spark-3.2.1-
>>> bin-hadoop3.2/examples/jars/spark-examples_2.12-3.2.1.jar, is not
>>> visible to the GKE cluster such that all nodes can read it. I suggest that
>>> you put it on gs:// bucket in GCP and access it from there.
>>>
>>>
>>> HTH
>>>
>>>
>>>    view my Linkedin profile
>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>
>>>
>>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>>
>>>
>>>
>>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>>> any loss, damage or destruction of data or any other property which may
>>> arise from relying on this email's technical content is explicitly
>>> disclaimed. The author will in no case be liable for any monetary damages
>>> arising from such loss, damage or destruction.
>>>
>>>
>>>
>>>
>>> On Thu, 17 Feb 2022 at 13:05, Gnana Kumar <gn...@gmail.com>
>>> wrote:
>>>
>>>> Hi There,
>>>>
>>>> I'm getting below error though I pass --class and --jars values
>>>> while submitting a spark job through Spark-Submit.
>>>> Please help.
>>>>
>>>> Exception in thread "main" org.apache.spark.SparkException: Failed to
>>>> get main class in JAR with error 'File file:/home/gnana_kumar123/spark/
>>>>  does not exist'.  Please specify one with --class.
>>>>         at
>>>> org.apache.spark.deploy.SparkSubmit.error(SparkSubmit.scala:972)
>>>>         at
>>>> org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:486)
>>>>         at org.apache.spark.deploy.SparkSubmit.org
>>>> $apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:898)
>>>>         at
>>>> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>>>>         at
>>>> org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>>>>         at
>>>> org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>>>>         at
>>>> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
>>>>         at
>>>> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
>>>>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>>>
>>>>
>>>>
>>>> ~/spark/spark-3.2.1-bin-hadoop3.2/bin/spark-submit \
>>>>            --master k8s://${K8S_SERVER}:443 \
>>>>            --deploy-mode cluster \
>>>>            --name sparkBQ \
>>>>            --conf spark.kubernetes.namespace=$NAMESPACE \
>>>>            --conf spark.network.timeout=300 \
>>>>            --conf spark.executor.instances=3 \
>>>>            --conf spark.kubernetes.allocation.batch.size=3 \
>>>>            --conf spark.kubernetes.allocation.batch.delay=1 \
>>>>            --conf spark.driver.cores=3 \
>>>>            --conf spark.executor.cores=3 \
>>>>            --conf spark.driver.memory=8092m \
>>>>            --conf spark.executor.memory=8092m \
>>>>            --conf spark.dynamicAllocation.enabled=true \
>>>>            --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
>>>>            --conf spark.kubernetes.driver.container.image=
>>>> ${SPARK_IMAGE} \
>>>>            --conf spark.kubernetes.executor.container.image=
>>>> ${SPARK_IMAGE} \
>>>>
>>>>            --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
>>>>            --conf spark.driver.extraJavaOptions=
>>>> "-Dio.netty.tryReflectionSetAccessible=true" \
>>>>            --conf spark.executor.extraJavaOptions=
>>>> "-Dio.netty.tryReflectionSetAccessible=true" \
>>>>            --class org.apache.spark.examples.SparkPi \
>>>>
>>>>            --jars /home/gnana_kumar123/spark/spark-3.2.1-bin-hadoop3.2/examples/jars/spark-examples_2.12-3.2.1.jar
>>>>
>>>> Thanks
>>>> GK
>>>>
>>>>
>>>>
>>>> On Wed, Feb 16, 2022 at 11:11 PM Gnana Kumar <gn...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Mich
>>>>>
>>>>> Also I would like to run Spark nodes ( Master and Worker nodes in
>>>>> Kubernetes) and then run my Java Spark application from a JAR file.
>>>>>
>>>>> Can you please let me know how to specify the JAR file and the MAIN
>>>>> class.
>>>>>
>>>>> Thanks
>>>>> GK
>>>>>
>>>>> On Wed, Feb 16, 2022 at 10:36 PM Gnana Kumar <gn...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Mich,
>>>>>>
>>>>>> I have built the image using the Dockerfile present
>>>>>> in spark-3.2.1-bin-hadoop3.2.tgz.
>>>>>>
>>>>>> Also I have pushed the same image to my docker hub account ie.
>>>>>> docker.io/gnanakumar123/spark3.2.1:latest
>>>>>>
>>>>>> I believe spark submit can pull image from docker hub when I run from
>>>>>> GKE's Cloud Shell. Please confirm.
>>>>>>
>>>>>> Below is the command I'm running.
>>>>>>
>>>>>> ./spark-submit \
>>>>>>   --master k8s://$K8S_SERVER \
>>>>>>   --deploy-mode cluster \
>>>>>>   --name spark-driver-pod \
>>>>>>   --class org.apache.spark.examples.SparkPi \
>>>>>>   --conf spark.executor.instances=2 \
>>>>>>   --conf spark.kubernetes.driver.container.image=
>>>>>> docker.io/gnanakumar123/spark3.2.1:latest \
>>>>>>   --conf spark.kubernetes.executor.container.image=
>>>>>> docker.io/gnanakumar123/spark3.2.1:latest \
>>>>>>   --conf spark.kubernetes.container.image=
>>>>>> docker.io/gnanakumar123/spark3.2.1:latest \
>>>>>>   --conf spark.kubernetes.driver.pod.name=spark-driver-pod \
>>>>>>   --conf spark.kubernetes.namespace=spark-demo \
>>>>>>   --conf spark.kubernetes.container.image.pullPolicy=Never \
>>>>>>   --conf
>>>>>> spark.kubernetes.authenticate.driver.serviceAccountName=spark \
>>>>>>     $SPARK_HOME/examples/jars/spark-examples_2.12-3.2.1.jar
>>>>>>
>>>>>> Thanks
>>>>>> GK
>>>>>>
>>>>>>
>>>>>> On Mon, Feb 14, 2022 at 10:50 PM Mich Talebzadeh <
>>>>>> mich.talebzadeh@gmail.com> wrote:
>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>>
>>>>>>> It is complaining about the missing driver container image. Does
>>>>>>> $SPARK_IMAGE point to a valid image in the GCP container registry?
>>>>>>>
>>>>>>> Example for a docker image for PySpark
>>>>>>>
>>>>>>>
>>>>>>> IMAGEDRIVER="eu.gcr.io/
>>>>>>> <PROJRECT>/spark-py:3.1.1-scala_2.12-8-jre-slim-buster-java8PlusPackages"
>>>>>>>
>>>>>>>
>>>>>>>         spark-submit --verbose \
>>>>>>>
>>>>>>>            --properties-file ${property_file} \
>>>>>>>
>>>>>>>            --master k8s://https://$KUBERNETES_MASTER_IP:443 \
>>>>>>>
>>>>>>>            --deploy-mode cluster \
>>>>>>>
>>>>>>>            --name sparkBQ \
>>>>>>>
>>>>>>>            --py-files $CODE_DIRECTORY_CLOUD/spark_on_gke.zip \
>>>>>>>
>>>>>>>            --conf spark.kubernetes.namespace=$NAMESPACE \
>>>>>>>
>>>>>>>            --conf spark.network.timeout=300 \
>>>>>>>
>>>>>>>            --conf spark.executor.instances=$NEXEC \
>>>>>>>
>>>>>>>            --conf spark.kubernetes.allocation.batch.size=3 \
>>>>>>>
>>>>>>>            --conf spark.kubernetes.allocation.batch.delay=1 \
>>>>>>>
>>>>>>>            --conf spark.driver.cores=3 \
>>>>>>>
>>>>>>>            --conf spark.executor.cores=3 \
>>>>>>>
>>>>>>>            --conf spark.driver.memory=8092m \
>>>>>>>
>>>>>>>            --conf spark.executor.memory=8092m \
>>>>>>>
>>>>>>>            --conf spark.dynamicAllocation.enabled=true \
>>>>>>>
>>>>>>>            --conf
>>>>>>> spark.dynamicAllocation.shuffleTracking.enabled=true \
>>>>>>>
>>>>>>>            --conf
>>>>>>> spark.kubernetes.driver.container.image=${IMAGEDRIVER} \
>>>>>>>
>>>>>>>            --conf
>>>>>>> spark.kubernetes.executor.container.image=${IMAGEDRIVER} \
>>>>>>>
>>>>>>>            --conf
>>>>>>> spark.kubernetes.authenticate.driver.serviceAccountName=spark-bq \
>>>>>>>
>>>>>>>            --conf
>>>>>>> spark.driver.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
>>>>>>>
>>>>>>>            --conf
>>>>>>> spark.executor.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true"
>>>>>>> \
>>>>>>>
>>>>>>>            $CODE_DIRECTORY_CLOUD/${APPLICATION}
>>>>>>>
>>>>>>> HTH
>>>>>>>
>>>>>>>
>>>>>>>    view my Linkedin profile
>>>>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>>>>
>>>>>>>
>>>>>>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *Disclaimer:* Use it at your own risk. Any and all responsibility
>>>>>>> for any loss, damage or destruction of data or any other property which may
>>>>>>> arise from relying on this email's technical content is explicitly
>>>>>>> disclaimed. The author will in no case be liable for any monetary damages
>>>>>>> arising from such loss, damage or destruction.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, 14 Feb 2022 at 17:04, Gnana Kumar <gn...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Also I'm using the below parameters while submitting the Spark job.
>>>>>>>>
>>>>>>>> spark-submit \
>>>>>>>>   --master k8s://$K8S_SERVER \
>>>>>>>>   --deploy-mode cluster \
>>>>>>>>   --name $POD_NAME \
>>>>>>>>   --class org.apache.spark.examples.SparkPi \
>>>>>>>>   --conf spark.executor.instances=2 \
>>>>>>>>   --conf spark.kubernetes.driver.container.image=$SPARK_IMAGE \
>>>>>>>>   --conf spark.kubernetes.executor.container.image=$SPARK_IMAGE \
>>>>>>>>   --conf spark.kubernetes.container.image=$SPARK_IMAGE \
>>>>>>>>   --conf spark.kubernetes.driver.pod.name=$POD_NAME \
>>>>>>>>   --conf spark.kubernetes.namespace=spark-demo \
>>>>>>>>   --conf spark.kubernetes.container.image.pullPolicy=Never \
>>>>>>>>   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
>>>>>>>>     $SPARK_HOME/examples/jars/spark-examples_2.12-3.2.1.jar
>>>>>>>>
>>>>>>>> On Mon, Feb 14, 2022 at 9:51 PM Gnana Kumar <
>>>>>>>> gnana.kumar123@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi There,
>>>>>>>>>
>>>>>>>>> I have been trying to run Spark 3.2.1 in Google Cloud's Kubernetes
>>>>>>>>> Cluster version 1.19 or 1.21
>>>>>>>>>
>>>>>>>>> But I kept getting the following error and could not proceed.
>>>>>>>>>
>>>>>>>>> Please help me resolve this issue.
>>>>>>>>>
>>>>>>>>> 22/02/14 16:00:48 INFO SparkKubernetesClientFactory:
>>>>>>>>> Auto-configuring K8S client using current context from users K8S config file
>>>>>>>>> Exception in thread "main" org.apache.spark.SparkException: Must
>>>>>>>>> specify the driver container image
>>>>>>>>>         at
>>>>>>>>> org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$driverContainerImage$1(BasicDriverFeatureStep.scala:45)
>>>>>>>>>         at scala.Option.getOrElse(Option.scala:189)
>>>>>>>>>         at
>>>>>>>>> org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.<init>(BasicDriverFeatureStep.scala:45)
>>>>>>>>>         at
>>>>>>>>> org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:46)
>>>>>>>>>         at
>>>>>>>>> org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:106)
>>>>>>>>>         at
>>>>>>>>> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$4(KubernetesClientApplication.scala:220)
>>>>>>>>>         at
>>>>>>>>> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$4$adapted(KubernetesClientApplication.scala:214)
>>>>>>>>>         at
>>>>>>>>> org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2713)
>>>>>>>>>         at
>>>>>>>>> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:214)
>>>>>>>>>         at
>>>>>>>>> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:186)
>>>>>>>>>         at org.apache.spark.deploy.SparkSubmit.org
>>>>>>>>> $apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
>>>>>>>>>         at
>>>>>>>>> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>>>>>>>>>         at
>>>>>>>>> org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>>>>>>>>>         at
>>>>>>>>> org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>>>>>>>>>         at
>>>>>>>>> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
>>>>>>>>>         at
>>>>>>>>> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
>>>>>>>>>         at
>>>>>>>>> org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Thanks
>>>>>>>>> Gnana
>>>>>>>>>

Re: Spark 3.2.1 in Google Kubernetes Version 1.19 or 1.21 - SparkSubmit Error

Posted by Gnana Kumar <gn...@gmail.com>.
Though I have created the Kubernetes RBAC resources as per the Spark site in my
GKE cluster, I'm still getting the pod name [null] error.

kubectl create serviceaccount spark
kubectl create clusterrolebinding spark-role --clusterrole=edit \
    --serviceaccount=default:spark --namespace=default
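
Note that the commands above create the service account in the default
namespace, while the spark-submit commands in this thread set
spark.kubernetes.namespace to spark-demo or $NAMESPACE. A minimal sketch of
creating the account and binding for the job's namespace instead, assuming
that namespace is spark-demo (the binding name spark-role-spark-demo is
illustrative):

kubectl create namespace spark-demo
kubectl create serviceaccount spark --namespace=spark-demo
kubectl create clusterrolebinding spark-role-spark-demo --clusterrole=edit \
    --serviceaccount=spark-demo:spark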

-- 
Thanks
Gnana

Re: Spark 3.2.1 in Google Kubernetes Version 1.19 or 1.21 - SparkSubmit Error

Posted by Gnana Kumar <gn...@gmail.com>.
Hi Mich

This is the latest error I'm stuck with. Please help me resolve this issue.

Exception in thread "main"
io.fabric8.kubernetes.client.KubernetesClientException: Operation: [create]
 for kind: [Pod]  with name: [null]  in namespace: [default]  failed.

~/spark/spark-3.2.1-bin-hadoop3.2/bin/spark-submit \
           --verbose \
           --class org.apache.spark.examples.SparkPi \
           --master k8s://${K8S_SERVER}:443 \
           --deploy-mode cluster \
           --name sparkBQ \
           --conf spark.kubernetes.namespace=$NAMESPACE \
           --conf spark.network.timeout=300 \
           --conf spark.executor.instances=3 \
           --conf spark.kubernetes.allocation.batch.size=3 \
           --conf spark.kubernetes.allocation.batch.delay=1 \
           --conf spark.driver.cores=3 \
           --conf spark.executor.cores=3 \
           --conf spark.driver.memory=8092m \
           --conf spark.executor.memory=8092m \
           --conf spark.dynamicAllocation.enabled=true \
           --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
           --conf spark.kubernetes.driver.pod.name=spark-pi-driver \
           --conf spark.kubernetes.driver.container.image=${SPARK_IMAGE} \
           --conf spark.kubernetes.executor.container.image=${SPARK_IMAGE} \
           --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
           --conf spark.driver.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
           --conf spark.executor.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
           --conf spark.kubernetes.file.upload.path=file:///tmp \
           local:///opt/spark/examples/jars/spark-examples_2.12-3.2.1.jar
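
One way to narrow this down (a sketch, not a confirmed diagnosis): the
exception reports namespace [default] even though the command sets
spark.kubernetes.namespace=$NAMESPACE, which may mean $NAMESPACE was empty in
that shell. kubectl can confirm the namespace value and the RBAC grants for
the spark service account directly:

echo "NAMESPACE=${NAMESPACE}"
kubectl auth can-i create pods --namespace=${NAMESPACE:-default}
kubectl auth can-i create pods --namespace=${NAMESPACE:-default} \
    --as=system:serviceaccount:${NAMESPACE:-default}:spark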

Thanks
GK


-- 
Thanks
Gnana

Re: Spark 3.2.1 in Google Kubernetes Version 1.19 or 1.21 - SparkSubmit Error

Posted by Mich Talebzadeh <mi...@gmail.com>.
Hi Gnana,

That JAR file, /home/gnana_kumar123/spark/spark-3.2.1-bin-hadoop3.2/examples/jars/spark-examples_2.12-3.2.1.jar,
is not visible to the GKE cluster, so the nodes cannot read it. I suggest that
you put it in a gs:// bucket in GCP and access it from there.
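
A minimal sketch of that approach, assuming a bucket named my-spark-bucket
(illustrative) and a Spark image that bundles the GCS connector so gs:// paths
are readable from the driver and executors:

gsutil cp ~/spark/spark-3.2.1-bin-hadoop3.2/examples/jars/spark-examples_2.12-3.2.1.jar \
    gs://my-spark-bucket/jars/

~/spark/spark-3.2.1-bin-hadoop3.2/bin/spark-submit \
    --master k8s://${K8S_SERVER}:443 \
    --deploy-mode cluster \
    --name sparkBQ \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.kubernetes.namespace=$NAMESPACE \
    --conf spark.kubernetes.container.image=${SPARK_IMAGE} \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
    gs://my-spark-bucket/jars/spark-examples_2.12-3.2.1.jar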


HTH


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



