Posted to user@spark.apache.org by Javier Pareja <pa...@gmail.com> on 2018/05/21 08:53:40 UTC

Executors slow down when running on the same node

Hello,

I have a Spark Streaming job that reads data from Kafka, processes it and
inserts it into Cassandra. The job runs on a cluster of 3 machines. I use
Mesos to submit the job with 3 executors, each using 1 core.
The problem is that when all executors run on the same node, the
insertion stage into Cassandra becomes about 5 times slower. When the
executors run one per machine, the insertion runs as expected.

I am using the DataStax Cassandra driver to insert the stream.

For now, all I can do is kill the submission and try again. Because
I can't predict how Mesos assigns resources, I might have to submit it
several times until it works.
Does anyone know what could be wrong? Any idea what I could look into?
Network, Max Host Connections, shared VM...?

Javier Pareja