Posted to user@spark.apache.org by Anton Puzanov <an...@gmail.com> on 2018/08/01 07:30:34 UTC

How to make Yarn dynamically allocate resources for Spark

Hi everyone,

I have a cluster managed by Yarn that runs Spark jobs; the components were
installed using Ambari (2.6.3.0-235). I have 6 hosts, each with 6 cores, and
I use the Fair scheduler.

I want Yarn to automatically add/remove executor cores, but no matter what
I do it doesn't work.

Relevant Spark configuration (configured in Ambari):

spark.dynamicAllocation.schedulerBacklogTimeout 10s
spark.dynamicAllocation.sustainedSchedulerBacklogTimeout 5s
spark.driver.memory 4G
spark.dynamicAllocation.enabled true
spark.dynamicAllocation.initialExecutors 6 (has no effect - starts with 2)
spark.dynamicAllocation.maxExecutors 10
spark.dynamicAllocation.minExecutors 1
spark.scheduler.mode FAIR
spark.shuffle.service.enabled true
SPARK_EXECUTOR_MEMORY="36G"
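
For reference, something like this should print the values the running
application actually sees (a quick sketch; "spark" here is the SparkSession
from the notebook snippet further down):

# Read back the effective dynamic allocation settings from the live session
for key in ["spark.dynamicAllocation.enabled",
            "spark.dynamicAllocation.initialExecutors",
            "spark.dynamicAllocation.minExecutors",
            "spark.dynamicAllocation.maxExecutors",
            "spark.shuffle.service.enabled"]:
    print(key, spark.conf.get(key, "<not set>"))

# Full effective configuration, in case something else overrides these values
print(spark.sparkContext.getConf().getAll())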

Relevant Yarn configuration (configured in Ambari):
yarn.nodemanager.aux-services mapreduce_shuffle,spark_shuffle,spark2_shuffle
YARN Java heap size 4096
yarn.resourcemanager.scheduler.class
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
yarn.scheduler.fair.preemption true
yarn.nodemanager.aux-services.spark2_shuffle.class
org.apache.spark.network.yarn.YarnShuffleService
yarn.nodemanager.aux-services.spark2_shuffle.classpath
{{stack_root}}/${hdp.version}/spark2/aux/*
yarn.nodemanager.aux-services.spark_shuffle.class
org.apache.spark.network.yarn.YarnShuffleService
yarn.nodemanager.aux-services.spark_shuffle.classpath
{{stack_root}}/${hdp.version}/spark/aux/*
Minimum Container Size (VCores) 0
Maximum Container Size (VCores) 12
Number of virtual cores 12
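
If I read the Ambari labels correctly, the three vcore settings above map to
these yarn-site.xml properties (this mapping is my assumption, listed mostly
so the property names are searchable):

yarn.scheduler.minimum-allocation-vcores 0
yarn.scheduler.maximum-allocation-vcores 12
yarn.nodemanager.resource.cpu-vcores 12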


Also, I followed the Dynamic resource allocation guide
<http://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation>
and went through all the steps to configure the external shuffle service,
including copying the yarn-shuffle jar:

cp /usr/hdp/2.6.3.0-235/spark/aux/spark-2.2.0.2.6.3.0-235-yarn-shuffle.jar
/usr/hdp/2.6.3.0-235/hadoop-yarn/lib/

I see only 3 cores allocated to the application (the default number of
executors is 2, so I guess it's the driver + 2 executors), although many
tasks are pending.

If it is relevant, I use Jupyter Notebook and findspark to connect to
the cluster:
import findspark
findspark.init()  # locate the Spark installation before importing pyspark
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("internal-external2").getOrCreate()
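
In case the notebook session is not picking up the Ambari-managed defaults,
I could also pass the dynamic allocation settings explicitly when building
the session. A minimal sketch (same values and app name as above):

import findspark
findspark.init()
from pyspark.sql import SparkSession

# Repeat the dynamic allocation settings on the builder so they apply
# even if spark-defaults.conf is not read by the notebook kernel
spark = (SparkSession.builder
         .appName("internal-external2")
         .config("spark.dynamicAllocation.enabled", "true")
         .config("spark.dynamicAllocation.initialExecutors", "6")
         .config("spark.dynamicAllocation.minExecutors", "1")
         .config("spark.dynamicAllocation.maxExecutors", "10")
         .config("spark.shuffle.service.enabled", "true")
         .getOrCreate())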


I would really appreciate any suggestions or help; there is no guide on
this topic that I haven't tried.
thx a lot,
Anton