Posted to dev@tinkerpop.apache.org by HadoopMarc <m....@xs4all.nl> on 2016/02/01 20:42:09 UTC

Re: Ruminations on SparkGraphComputer at Scale

Hi Marko,

Thanks for your enthusiastic and useful report! We had similar 
experiences over here. SparkGraphComputer seems to like small chunks of 
data of 128 MB or so, even if you have 8 or 16 GB in your executors.
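
For example, something along these lines in the Hadoop/Spark properties file 
(just a sketch, not a recipe; the split-size property below is the standard 
Hadoop one and the values are only illustrative, so please verify them 
against your own versions):

    # illustrative settings only
    # keep input splits (and thus Spark partitions) around 128 MB
    mapreduce.input.fileinputformat.split.maxsize=134217728
    # a large executor heap does not change the preference for small chunks
    spark.executor.memory=8g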

In addition, when running Spark on YARN, you need a high spark.yarn.executor.memoryOverhead 
value of about 20% of the executor memory, while 6-10% is mentioned in the Spark on YARN reference 
https://spark.apache.org/docs/1.5.2/running-on-yarn.html .
Otherwise, the executor starves or gets killed when YARN is set to police its queues.
I am sorry I cannot provide any quantitative data, but I thought I'd mention 
it anyway, to give people a hint about which knobs to tune.
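
For concreteness, the kind of settings I mean (again only a sketch; the 20% 
is just our rule of thumb, and the overhead value is given in MB in Spark 1.5):

    # illustrative Spark-on-YARN settings for an 8g executor
    spark.master=yarn-client
    spark.executor.memory=8g
    # roughly 20% of 8192 MB, instead of the 6-10% the docs suggest
    spark.yarn.executor.memoryOverhead=1600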

Cheers,     Marc