Posted to dev@spark.apache.org by mkhaitman <ma...@chango.com> on 2015/02/09 22:26:48 UTC

pyspark.daemon issues?

I've noticed a couple of oddities with the pyspark.daemon processes which are
causing us some memory problems in our heavier Spark jobs, especially when
several of those jobs run at the same time...

It seems that there is typically a 1-to-1 ratio of pyspark.daemon processes to
cores per executor during aggregations. We leave spark.python.worker.memory at
its default of 512MB, after which the remainder of the aggregation is supposed
to spill to disk.
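
For context, here is a rough sketch of how that setting is applied in our
jobs; the app name and executor memory below are placeholders rather than our
real values:

    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .setAppName("aggregation-job")              # placeholder name
            .set("spark.python.worker.memory", "512m")  # soft limit per Python worker before it spills
            .set("spark.executor.memory", "4g"))        # JVM heap only; does not cover pyspark.daemon RSS

    sc = SparkContext(conf=conf)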

However:
          *1)* I'm not entirely sure which cases result in varying numbers of
pyspark daemons that do not respect the python worker memory limit. I've seen
some grow to as much as 2GB each (well over the 512MB limit), which is when we
run into serious memory problems for jobs using many cores per executor. To be
clear, they ARE spilling to disk, but they are also somehow blowing past the
memory limit at the same time.

          *2)* Another scenario involves joining RDDs. Say, for example, there
are 4 cores per executor, and therefore 4 pyspark daemons during most
aggregations. When a join occurs, it seems to spawn 4 additional pyspark
daemons instead of simply re-using the ones that were already present during
the aggregation stage that preceded it. Combined with the python worker memory
limit not being strictly respected, this can lead to far more memory being
used per node (a minimal reproduction sketch follows).
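
For illustration, a minimal sketch of the shape of job where we see this; the
data, key ranges, and app name are made up and nothing here reflects our
actual workload:

    from pyspark import SparkContext

    sc = SparkContext(appName="join-repro-sketch")    # placeholder app name

    # Two keyed RDDs; the reduceByKey below is the aggregation stage where we
    # see one pyspark.daemon per core on each executor.
    left  = sc.parallelize(range(1000000)).map(lambda x: (x % 1000, 1))
    right = sc.parallelize(range(1000000)).map(lambda x: (x % 1000, 2))

    agg = left.reduceByKey(lambda a, b: a + b)

    # The join stage is where we appear to get an *additional* set of daemons
    # rather than the existing ones being reused.
    joined = agg.join(right)
    joined.count()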

The fact that the python worker memory appears to be allocated *outside* of
the executor memory is what poses the biggest challenge for preventing memory
exhaustion on a node. Is there something obvious, or some configuration or
environment variable I may have missed, that could help with one or both of
the above concerns? Any other suggestions would also be greatly
appreciated! :)
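
To make the concern concrete, here is the rough worst-case arithmetic we're
working from; the node layout and sizes are illustrative numbers, not our
actual cluster:

    # Back-of-the-envelope worst case per node (all figures illustrative):
    executors_per_node   = 2
    cores_per_executor   = 4
    executor_heap_gb     = 4    # spark.executor.memory (JVM heap only)
    daemon_rss_gb        = 2    # observed per pyspark.daemon, despite the 512MB limit
    daemons_per_executor = cores_per_executor * 2    # doubled during a join (scenario 2)

    jvm_total_gb    = executors_per_node * executor_heap_gb
    python_total_gb = executors_per_node * daemons_per_executor * daemon_rss_gb

    print("JVM heaps:      %d GB" % jvm_total_gb)     # 8 GB
    print("Python daemons: %d GB" % python_total_gb)  # 32 GB, on top of (not inside) the heaps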

Thanks,
Mark.

--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/pyspark-daemon-issues-tp10533.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
