Posted to user@spark.apache.org by purna pradeep <pu...@gmail.com> on 2018/03/08 14:51:50 UTC

handling Remote dependencies for spark-submit in spark 2.3 with kubernetes

I'm trying to run spark-submit against a Kubernetes cluster with the Spark 2.3
Docker container image.

The challenge I'm facing is that the application has a mainapplication.jar and
other dependency files & jars which live in a remote location such as AWS S3.
Per the Spark 2.3 documentation there is a Kubernetes init-container that
downloads remote dependencies, but in this case I'm not creating any pod spec
to include init-containers in Kubernetes; per the documentation, Spark 2.3
internally creates the pods (driver, executor). So I'm not sure how I can use
the init-container for spark-submit when there are remote dependencies.

https://spark.apache.org/docs/latest/running-on-kubernetes.html#using-remote-dependencies

Please suggest

Re: handling Remote dependencies for spark-submit in spark 2.3 with kubernetes

Posted by Yinan Li <li...@gmail.com>.
One thing to note: you may need to have the S3 credentials available in the
init-container unless you use a publicly accessible URL. If that is the case,
you can either create a Kubernetes secret and use the Spark config options for
mounting secrets (secrets will be mounted into the init-container as well as
into the main container), or create a custom init-container image with the
credentials baked in.

Yinan
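A minimal sketch of the secret-mounting approach described above. The secret
name `aws-creds`, the mount path `/etc/aws`, and the credential values are
placeholders, not from this thread; the `spark.kubernetes.*.secrets.*` options
are the Spark 2.3 config keys for mounting a named Kubernetes secret into the
driver and executor pods:

```shell
# Create a Kubernetes secret holding the S3 credentials (placeholder values).
kubectl create secret generic aws-creds \
  --from-literal=access-key=<aws-access-key> \
  --from-literal=secret-key=<aws-secret-key>

# Ask Spark to mount that secret into the driver and executor pods at
# /etc/aws; in Spark 2.3 the secret is mounted into the init-container too,
# so the dependency download can authenticate against S3.
spark-submit \
  --master k8s://https://<apiserver-host>:<port> \
  --deploy-mode cluster \
  --conf spark.kubernetes.container.image=<spark-image> \
  --conf spark.kubernetes.driver.secrets.aws-creds=/etc/aws \
  --conf spark.kubernetes.executor.secrets.aws-creds=/etc/aws \
  ...
```

How the mounted files are turned into s3a credentials (e.g. via
`spark.hadoop.fs.s3a.*` properties or an entrypoint script) depends on the
image, so that part is left out here.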


Re: handling Remote dependencies for spark-submit in spark 2.3 with kubernetes

Posted by Anirudh Ramanathan <ra...@google.com.INVALID>.
You don't need to create the init-container. It's an implementation detail.
If you provide a remote uri, and
specify spark.kubernetes.container.image=<spark-image>, Spark *internally*
will add the init container to the pod spec for you.
*If*, for some reason, you want to customize the init container image, you
can choose to do that using the specific options, but I don't think this is
necessary in most scenarios. The init container image, driver and executor
images can be identical by default.
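As a concrete sketch of the above: passing remote URIs on the command line is
enough, and Spark 2.3 injects the init-container that fetches them before the
driver starts. The image name, bucket, paths, and class below are placeholders,
not from this thread:

```shell
# Submit with the main jar and an extra dependency both on S3; no pod spec
# or init-container is written by hand -- spark-submit adds it internally.
spark-submit \
  --master k8s://https://<apiserver-host>:<port> \
  --deploy-mode cluster \
  --name my-app \
  --class com.example.MainApplication \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.container.image=<spark-image> \
  --jars s3a://<bucket>/deps/helper.jar \
  s3a://<bucket>/jars/mainapplication.jar
```

Note that fetching `s3a://` URIs assumes the hadoop-aws connector (and
matching AWS SDK jars) are present in the container image; the stock Spark 2.3
image may not include them.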




-- 
Anirudh Ramanathan