Posted to user@spark.apache.org by "barge.nilesh" <ba...@gmail.com> on 2015/07/01 02:26:57 UTC

Re: run reduceByKey on huge data in spark

"I 'm using 50 servers , 35 executors per server, 140GB memory per server"

35 executors *per server* sounds kind of odd to me.

With 35 executors per server and 140GB per server, each executor
is going to get only about 4GB. That 4GB is further divided into the
shuffle/storage memory fractions... assuming the default storage memory
fraction of 0.6, that is only about 2.4GB of working space per executor,
so if any single partition (key group) exceeds 2.4GB there will be an OOM...
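
A rough back-of-the-envelope sketch of that arithmetic, assuming the
Spark 1.x default spark.storage.memoryFraction = 0.6 (the exact split
depends on your configuration):

  // Illustrative only -- figures taken from the numbers quoted above.
  val memoryPerServer    = 140.0                                  // GB per server
  val executorsPerServer = 35
  val heapPerExecutor    = memoryPerServer / executorsPerServer   // ~4 GB each
  val storageFraction    = 0.6                                    // spark.storage.memoryFraction default
  val workingSpace       = heapPerExecutor * storageFraction      // ~2.4 GB
  // Any single key group bigger than ~2.4 GB has nowhere to fit -> OOM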

Maybe you can try with fewer executors per server/node...
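
For example, a layout with fewer but larger executors per node gives each
task more headroom. The snippet below is only a hypothetical sketch (5
executors of ~26GB each on a 140GB server is an assumption, not a tested
recommendation); adjust the numbers for your own cluster:

  import org.apache.spark.SparkConf

  // Hypothetical alternative: 5 executors per 140GB server instead of 35,
  // leaving a few GB per node for the OS and daemons.
  val conf = new SparkConf()
    .setAppName("reduceByKey-on-huge-data")
    .set("spark.executor.memory", "26g")           // bigger heap per executor
    .set("spark.executor.cores", "5")
    .set("spark.storage.memoryFraction", "0.6")    // Spark 1.x default, shown explicitly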

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/run-reduceByKey-on-huge-data-in-spark-tp23546p23555.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org