Posted to user@spark.apache.org by Abhijeet Kumar <ab...@sentienz.com> on 2018/11/28 22:25:47 UTC

Spark streaming join on yarn

Hello Team,

I’m running Spark on YARN and performing a simple join between two streams.

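The join is roughly of the following shape (a sketch only: the topic names, broker address, and column names below are placeholders, not the actual job):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("RealtimeTuning").getOrCreate()

// Left stream from Kafka (placeholder topic and broker)
val left = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "left_topic")
  .load()
  .selectExpr("CAST(key AS STRING) AS id", "CAST(value AS STRING) AS leftValue")

// Right stream from Kafka (placeholder topic and broker)
val right = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "right_topic")
  .load()
  .selectExpr("CAST(key AS STRING) AS id", "CAST(value AS STRING) AS rightValue")

// Inner join of the two streams on the shared key
val joined = left.join(right, "id")

joined.writeStream
  .format("console")
  .outputMode("append")
  .start()
  .awaitTermination()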

DAG of my job: [image not included in the plain-text view]
It’s taking around 13 seconds to complete Stage 2.

My command to run the jar:

spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.1.0 --class com.streaming.spark.RealtimeTuning --master yarn --deploy-mode cluster --executor-memory 4G --driver-memory 2G --num-executors 1 ./target/scala-2.11/RealtimeTuning-assembly-0.1.jar

Note: I’m running everything locally (single-node cluster).

Any help would be appreciated.

Thank you,

Abhijeet Kumar
Software Development Engineer
Sentienz Solutions Private Limited

Re: Spark streaming join on yarn

Posted by Tathagata Das <ta...@gmail.com>.
How many tasks are in Stage 2, and how long do they take? If there are 200 tasks taking about 1 second each (that is, many "rounds" of tasks on the available cores, adding up to 13 seconds), then you can reduce the number of tasks by setting the SQL conf spark.sql.shuffle.partitions (defaults to 200). Given the number of cores in your cluster, you probably want 1-3 rounds of tasks, not more.
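For example, the conf can be set either in code before the query starts or on the command line (the value 8 below is only an illustration; pick something close to the number of cores you actually have):

// In code, before starting the streaming query
spark.conf.set("spark.sql.shuffle.partitions", "8")

// Or as part of spark-submit
spark-submit --conf spark.sql.shuffle.partitions=8 ...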
