Posted to user@spark.apache.org by thrisha <ts...@threatmetrix.com> on 2019/03/27 22:52:46 UTC

Spark migration to Kubernetes

Hi All,



We have Spark Streaming pipelines (written in Java) currently running on YARN
in production. We are evaluating moving these streaming pipelines onto
Kubernetes. We have set up a working Kubernetes cluster, and I have been
reading the Spark documentation and a few blogs on migrating to Kubernetes.



1. However, it is not clear to me how to migrate existing pipelines to Spark
on Kubernetes. Any pointers on this would be helpful.



2. Also, I am trying to run the sample wordcount example using the commands
from the documentation
(https://spark.apache.org/docs/2.4.0/running-on-kubernetes.html#cluster-mode).
However, I cannot figure out how to pass the Spark Docker image as a conf
(spark.kubernetes.container.image). Our machines have no access to the
internet, so I have manually pre-loaded a Spark Docker image from gcr.io
into our local Docker images. What should my spark-submit command look like?



3. Would specifying spark.kubernetes.container.image.pullPolicy=IfNotPresent
cause the Docker image to be pulled only if it is not already present in the
local Docker image list?
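
The command that questions 2 and 3 ask about can be sketched roughly as
follows, adapted from the cluster-mode example in the documentation linked
above. Everything set in the variables is a placeholder and an assumption,
not a value from our environment: the API server address, the image name
spark:2.4.0, the examples jar path (the location used inside the stock Spark
2.4.0 image) and the input file. The sketch only prints the command unless
DRY_RUN=0 is set.

```shell
#!/bin/sh
# Placeholders (assumptions for illustration only).
K8S_API="${K8S_API:-https://k8s-apiserver.example:6443}"
SPARK_IMAGE="${SPARK_IMAGE:-spark:2.4.0}"
EXAMPLES_JAR="${EXAMPLES_JAR:-local:///opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar}"
INPUT_FILE="${INPUT_FILE:-/path/to/input.txt}"

# Print the command by default; set DRY_RUN=0 to actually submit it.
if [ "${DRY_RUN:-1}" = "1" ]; then RUN=echo; else RUN=; fi

# The image and pull policy are passed as ordinary --conf key=value pairs.
# The local:// scheme refers to a jar already inside the container image.
$RUN spark-submit \
  --master "k8s://${K8S_API}" \
  --deploy-mode cluster \
  --name spark-wordcount \
  --class org.apache.spark.examples.JavaWordCount \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.container.image="${SPARK_IMAGE}" \
  --conf spark.kubernetes.container.image.pullPolicy=IfNotPresent \
  "${EXAMPLES_JAR}" \
  "${INPUT_FILE}"
```

With IfNotPresent, Kubernetes skips the pull when an image with that exact
name and tag is already present on the node where the pod is scheduled, so
the pre-loaded image would presumably need to exist on every node that may
run a driver or executor pod.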



Any help in answering the above questions would be appreciated.


