Posted to dev@spark.apache.org by Michel Sumbul <mi...@gmail.com> on 2020/06/29 09:34:58 UTC

Re: Spark 3 pod template for the driver

Hello,

Adding the dev mailing list; maybe there is someone here who can help by
sharing a valid/accepted pod template for Spark 3?

Thanks in advance,
Michel


On Fri, Jun 26, 2020 at 14:03, Michel Sumbul <mi...@gmail.com>
wrote:

> Hi Jorge,
> If I set that in the spark-submit command it works, but I want it only in
> the pod template file.
>
> Best regards,
> Michel
>
> On Fri, Jun 26, 2020 at 14:01, Jorge Machado <jo...@me.com> wrote:
>
>> Try to set spark.kubernetes.container.image
>>
>> On 26. Jun 2020, at 14:58, Michel Sumbul <mi...@gmail.com> wrote:
>>
>> Hi guys,
>>
>> I am trying to use Spark 3 on top of Kubernetes and to specify a pod
>> template for the driver.
>>
>> Here is my pod manifest for the driver. When I do a spark-submit with
>> the option:
>> --conf
>> spark.kubernetes.driver.podTemplateFile=/data/k8s/podtemplate_driver3.yaml
>>
>> I get the error message that I need to specify an image, but it is already
>> set in the manifest.
>> Is my manifest file wrong? What should it look like?
>>
>> Thanks for your help,
>> Michel
>>
>> --------
>> The pod manifest:
>>
>> apiVersion: v1
>> kind: Pod
>> metadata:
>>   # Pod names must be lowercase RFC 1123 names; "mySpark3App" would be rejected
>>   name: my-spark3-app
>>   labels:
>>     app: mySpark3App
>>     customlabel/app-id: "1"
>> spec:
>>   securityContext:
>>     runAsUser: 1000
>>   volumes:
>>     - name: "test-volume"
>>       emptyDir: {}
>>   containers:
>>     - name: spark3driver
>>       image: mydockerregistry.example.com/images/dev/spark3:latest
>>       resources:
>>         requests:
>>           cpu: "1000m"
>>           memory: "512Mi"
>>         limits:
>>           cpu: "1000m"
>>           memory: "512Mi"
>>       volumeMounts:
>>        - name: "test-volume"
>>          mountPath: "/tmp"
>>
>>
>>
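
For context, a complete submit around that single --conf fragment would look
something like the sketch below; the master URL, app name, class, and jar
path are placeholders, not values from this thread:

  # Everything here is illustrative except the podTemplateFile path above.
  spark-submit \
    --master k8s://https://<kubernetes-api-host>:6443 \
    --deploy-mode cluster \
    --name my-spark3-app \
    --class org.example.MyApp \
    --conf spark.kubernetes.driver.podTemplateFile=/data/k8s/podtemplate_driver3.yaml \
    local:///opt/spark/examples/jars/my-app.jar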

Re: Spark 3 pod template for the driver

Posted by Edward Mitchell <ed...@gmail.com>.
If I had to guess, it's likely because the Spark code would have to read
the YAML to make sure the required parameters are set, and the way it's
done was just easier to build on without a lot of refactoring.

On Mon, Jul 6, 2020 at 5:06 PM Michel Sumbul <mi...@yahoo.fr> wrote:

> Thanks Edward for the reply!
>
> I have the impression that's also the case for other settings, like the
> memory requested. Is that right?
> I think I will create a ticket to allow the user to specify any
> configuration in a YAML file instead of passing a long list of --conf
> parameters when submitting the job. Or are there reasons it was not done
> that way from the beginning?
>
> thanks,
> Michel
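
(A side note on the ticket idea: for flat key=value settings, spark-submit
can already load them from a file with --properties-file; only pod-level
fields such as volumes need the pod template YAML. A sketch, assuming a
hypothetical file at /data/k8s/my-spark3-app.conf in the standard Spark
properties format:

  # /data/k8s/my-spark3-app.conf (hypothetical path)
  spark.kubernetes.container.image=mydockerregistry.example.com/images/dev/spark3:latest
  spark.kubernetes.driver.podTemplateFile=/data/k8s/podtemplate_driver3.yaml

  spark-submit --properties-file /data/k8s/my-spark3-app.conf ...)
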
>
> On Thu, Jul 2, 2020 at 00:43:25 UTC+1, Edward Mitchell <
> edeesis@gmail.com> wrote:
>
>
> Okay, I see what's going on here.
>
> Looks like the way that Spark is coded, the driver container image
> (specified by --conf
> spark.kubernetes.driver.container.image) and executor container image
> (specified by --conf
> spark.kubernetes.executor.container.image) are required.
>
> If they're not specified, it'll fall back to --conf
> spark.kubernetes.container.image.
>
> The way the "pod template" feature was coded is such that even if the
> image is specified in the YAML, those conf properties take priority and
> override the value set in the YAML file.
>
> So basically what I'm saying is that although you have the image in the
> YAML file, you still need to specify those properties.
>
> If, like you said, the goal is to not specify those in the spark-submit,
> you'll likely need to file an Improvement in JIRA.
>
> On Tue, Jun 30, 2020 at 5:26 AM Michel Sumbul <mi...@gmail.com>
> wrote:
>
> Hi Edeesis,
>
> The goal is to not have these settings in the spark-submit command. If I
> specify the same things in a pod template for the executor, I still get the
> message:
> "Exception in thread "main" org.apache.spark.SparkException: Must specify
> the driver container image"
>
> It doesn't even try to start an executor container, as the driver has not
> started yet.
> Any idea?
>
> Thanks,
> Michel
>
> On Tue, Jun 30, 2020 at 00:06, edeesis <ed...@gmail.com> wrote:
>
> If I could muster a guess, you still need to specify the executor image. As
> is, this will only specify the driver image.
>
> You can specify it as --conf spark.kubernetes.container.image or --conf
> spark.kubernetes.executor.container.image

Re: Spark 3 pod template for the driver

Posted by Edward Mitchell <ed...@gmail.com>.
Okay, I see what's going on here.

Looks like the way that Spark is coded, the driver container image
(specified by --conf
spark.kubernetes.driver.container.image) and executor container image
(specified by --conf
spark.kubernetes.executor.container.image) are required.

If they're not specified, it'll fall back to --conf
spark.kubernetes.container.image.

The way the "pod template" feature was coded is such that even if the image
is specified in the YAML, those conf properties take priority and override
the value set in the YAML file.

So basically what I'm saying is that although you have the image in the
YAML file, you still need to specify those properties.

If, like you said, the goal is to not specify those in the spark-submit,
you'll likely need to file an Improvement in JIRA.
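
Concretely, a submit that passes that check while still using the template
might look like this sketch (the image is the one already named in the
template earlier in the thread; the master URL and jar are placeholders):

  spark-submit \
    --master k8s://https://<kubernetes-api-host>:6443 \
    --deploy-mode cluster \
    --conf spark.kubernetes.container.image=mydockerregistry.example.com/images/dev/spark3:latest \
    --conf spark.kubernetes.driver.podTemplateFile=/data/k8s/podtemplate_driver3.yaml \
    local:///opt/spark/examples/jars/my-app.jar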

On Tue, Jun 30, 2020 at 5:26 AM Michel Sumbul <mi...@gmail.com>
wrote:

> Hi Edeesis,
>
> The goal is to not have these settings in the spark-submit command. If I
> specify the same things in a pod template for the executor, I still get the
> message:
> "Exception in thread "main" org.apache.spark.SparkException: Must specify
> the driver container image"
>
> It doesn't even try to start an executor container, as the driver has not
> started yet.
> Any idea?
>
> Thanks,
> Michel
>
> On Tue, Jun 30, 2020 at 00:06, edeesis <ed...@gmail.com> wrote:
>
>> If I could muster a guess, you still need to specify the executor image.
>> As
>> is, this will only specify the driver image.
>>
>> You can specify it as --conf spark.kubernetes.container.image or --conf
>> spark.kubernetes.executor.container.image

Re: Spark 3 pod template for the driver

Posted by Michel Sumbul <mi...@gmail.com>.
Hi Edeesis,

The goal is to not have these settings in the spark-submit command. If I
specify the same things in a pod template for the executor, I still get the
message:
"Exception in thread "main" org.apache.spark.SparkException: Must specify
the driver container image"

It doesn't even try to start an executor container, as the driver has not
started yet.
Any idea?

Thanks,
Michel
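
For the executor side there is a matching setting,
spark.kubernetes.executor.podTemplateFile, so templating both pods would be
passed roughly as below (the executor template path is hypothetical, and per
the discussion above the image conf is still required on top of it):

  --conf spark.kubernetes.driver.podTemplateFile=/data/k8s/podtemplate_driver3.yaml \
  --conf spark.kubernetes.executor.podTemplateFile=/data/k8s/podtemplate_executor3.yaml \
  --conf spark.kubernetes.container.image=mydockerregistry.example.com/images/dev/spark3:latest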

On Tue, Jun 30, 2020 at 00:06, edeesis <ed...@gmail.com> wrote:

> If I could muster a guess, you still need to specify the executor image. As
> is, this will only specify the driver image.
>
> You can specify it as --conf spark.kubernetes.container.image or --conf
> spark.kubernetes.executor.container.image

Re: Spark 3 pod template for the driver

Posted by edeesis <ed...@gmail.com>.
If I could muster a guess, you still need to specify the executor image. As
is, this will only specify the driver image.

You can specify it as --conf spark.kubernetes.container.image or --conf
spark.kubernetes.executor.container.image
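
In conf form, the two options look like this (image names are placeholders):

  # One image shared by the driver and executors:
  --conf spark.kubernetes.container.image=<registry>/<image>:<tag>

  # Or one image per role:
  --conf spark.kubernetes.driver.container.image=<registry>/<driver-image>:<tag> \
  --conf spark.kubernetes.executor.container.image=<registry>/<executor-image>:<tag>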


