Posted to dev@spark.apache.org by Mich Talebzadeh <mi...@gmail.com> on 2021/12/14 11:28:31 UTC

Sizing the driver & executor cores and memory in Kubernetes cluster

Hi,

I have a three-node k8s cluster (GKE) in Google Cloud with E2
standard machines that have 4 GB of system memory per vCPU, giving 4 vCPUs
and 16,384 MB of RAM per node.

An optimum sizing of the number of executors, CPU and memory allocation is
important here. These are the assumptions:

   1. You want to fit exactly one Spark executor pod per Kubernetes node
   2. You should not starve the node OS, networking etc. of CPU
   3. If you have 3 nodes, one node should be allocated to the driver and
   two nodes to the executors
   4. Regardless, you want the code to execute in k8s as fast as possible
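
As a back-of-the-envelope sketch (my arithmetic, not part of the original
assumptions), the node specs above translate into a per-pod sizing along
these lines:

```python
# Sketch: derive driver/executor sizing from an E2-standard-4 node
# (4 vCPUs, 16384 MB RAM), per the assumptions above.
NODE_VCPUS = 4
NODE_RAM_MB = 16384

# Assumption 2: reserve one vCPU per node for the OS, kubelet and networking.
cores_per_pod = NODE_VCPUS - 1

# Claim roughly half of the node RAM for the Spark pod, leaving headroom
# for the OS, system pods and Spark's own memory overhead.
memory_per_pod_mb = NODE_RAM_MB // 2

print(cores_per_pod, memory_per_pod_mb)  # 3 cores, 8192 MB per pod
```

Note that 8192 MB is close to the 8000m settings used further down.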

I don't think that, with the current architecture, one can force the driver
node to accommodate both the driver and an executor at the same time. I did
some tests and looked at the available discussions here
<https://spark.apache.org/docs/latest/running-on-kubernetes.html> and here
<https://www.datamechanics.co/blog-post/setting-up-managing-monitoring-spark-on-kubernetes>.
One can fine-tune various parameters, but these seem to work well:

          --conf spark.executor.instances=2 \
          --conf spark.driver.cores=3 \
          --conf spark.executor.cores=3 \
          --conf spark.driver.memory=8000m \
          --conf spark.executor.memory=8000m \

What I am suggesting here is to leave one vCPU out of the four vCPUs to the
OS on each node. It is a safer bet to grab half of the memory available on
each node for the driver and executors. Your mileage may vary: if you try
to allocate more memory, it will take longer for the driver and executor
pods to spin up (ContainerCreating), meaning that the overall execution
time will be longer. This could be offset if you are running a long job and
care more about having the extra memory than about the container creation
time. It would be interesting to hear whether others have done similar
configuration, and about their experience.
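
For completeness, the settings above can be kept in one place and rendered
as spark-submit flags; the helper below is a hypothetical convenience, not
part of any Spark API:

```python
# Hypothetical helper: keep the sizing in one dict and render it as
# spark-submit --conf flags. The values are the ones quoted above.
confs = {
    "spark.executor.instances": "2",
    "spark.driver.cores": "3",
    "spark.executor.cores": "3",
    "spark.driver.memory": "8000m",
    "spark.executor.memory": "8000m",
}

flags = " \\\n".join(f"--conf {k}={v}" for k, v in confs.items())
print(flags)
```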


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.

Re: Sizing the driver & executor cores and memory in Kubernetes cluster

Posted by Mich Talebzadeh <mi...@gmail.com>.
Following on from the above, I did some tests with leaving 1 vCPU out
of 4 vCPUs to the OS on each node (driver & executor) in the three-node GKE
cluster. The RAM allocated to each node was 16 GB. I then set the initial
driver AND executor memory to 10% of RAM and incremented both in
steps of 10%, from 10% to 50%, and measured the time taken for the code to
finish (from start to end). Basically a simple

   import time

   start_time = time.time()

   # ... the Spark job runs here ...

   end_time = time.time()

   time_elapsed = end_time - start_time



which measured the completion time in seconds. The systematics were kept
the same for all measurements, and only one measurement was taken at each
memory setting, i.e.


 --conf spark.driver.memory=<Memory in MB> \

 --conf spark.executor.memory=<Memory in MB> \



 Memories were set the same for both the driver and the executors.
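
As a sanity check on the sweep, the memory values at each step work out as
follows (a sketch, assuming 16,384 MB of RAM per node):

```python
# Memory settings swept: 10% to 50% of node RAM in steps of 10%.
NODE_RAM_MB = 16384
settings_mb = [NODE_RAM_MB * pct // 100 for pct in range(10, 60, 10)]
print(settings_mb)  # [1638, 3276, 4915, 6553, 8192]
```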


The results I got were as follows:


[image: image.png — chart of completion time versus memory setting; not
included in the plain-text archive]


So it appears that allocating around 50% of RAM to both the driver and the
executors provides the optimum value. Increasing the memory above 50% (say
to 60% = 9830 MB) resulted in the container never being created (stuck at
Pending), presumably because the pod is requesting more memory than any
node can offer, as shown below:


k describe pod sparkbq-b506ac7dc521b667-driver -n spark

Events:

  Type     Reason             Age                   From                Message
  ----     ------             ----                  ----                -------
  Warning  FailedScheduling   17m                   default-scheduler   0/3 nodes are available: 3 Insufficient memory.
  Warning  FailedScheduling   17m                   default-scheduler   0/3 nodes are available: 3 Insufficient memory.
  Normal   NotTriggerScaleUp  2m28s (x92 over 17m)  cluster-autoscaler  pod didn't trigger scale-up:
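
One plausible explanation (my reading, not confirmed in the thread): Spark
adds a memory overhead to each pod's memory request, by default
max(384 MB, 10% of the configured memory) for JVM workloads, so the actual
request at the 60% setting is noticeably larger than 9830 MB:

```python
# Rough arithmetic for the 60% case. The 10% / 384 MB overhead rule is the
# Spark-on-k8s default; the conclusion about node allocatable memory is an
# assumption, not something verified in this thread.
requested_mb = 9830                              # 60% of 16384 MB
overhead_mb = max(384, int(requested_mb * 0.10)) # default overhead rule
pod_request_mb = requested_mb + overhead_mb
print(pod_request_mb)  # 10813
```

Once GKE's kubelet and system reservations are subtracted from the node's
16,384 MB, the allocatable memory can fall below this request, which would
match the scheduler's "0/3 nodes are available: 3 Insufficient memory"
events above.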

HTH





