Posted to dev@tinkerpop.apache.org by HadoopMarc <m....@xs4all.nl> on 2016/02/01 20:42:09 UTC
Re: Ruminations on SparkGraphComputer at Scale
Hi Marko,
Thanks for your enthusiastic and useful report! We had similar
experiences over here. SparkGraphComputer seems to like small chunks of
data of 128MB or so, even if you have 8 or 16 GB in your executors.
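For illustration, this is the kind of setting that keeps the chunks
small (the property name is from Hadoop 2.x; the values are examples,
not measured optima, and the right knob may differ per input format):

    # in the HadoopGraph properties file passed to SparkGraphComputer
    # cap each input split, and hence each Spark partition, at ~128MB
    mapreduce.input.fileinputformat.split.maxsize=134217728
    # executors can still get a large heap
    spark.executor.memory=8g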
In addition, when running Spark on YARN, you need a high
spark.yarn.executor.memoryOverhead value of about 20%, while the
Spark-on-YARN reference mentions 6-10%:
https://spark.apache.org/docs/1.5.2/running-on-yarn.html
Otherwise, the executors starve when YARN is set to police its queues.
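As an example (the numbers are illustrative): in Spark 1.5 the overhead
defaults to max(384MB, 10% of executor memory), so for an 8g executor
you would raise it from ~819MB to roughly 20% like this:

    # spark.yarn.executor.memoryOverhead is given in MB in Spark 1.x
    spark.executor.memory=8g
    # ~20% of 8192MB instead of the 10% default
    spark.yarn.executor.memoryOverhead=1638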
I am sorry I cannot provide any quantitative data, but I thought I'd
mention it anyway, to give people a hint about which knobs to tune.
Cheers, Marc