Posted to user@spark.apache.org by "lokesh.gidra" <lo...@gmail.com> on 2014/07/12 23:43:18 UTC
Scalability issue in Spark with SparkPageRank example
Hello,
I ran the SparkPageRank example (the one included in the package) to evaluate
the scale-in capability of Spark. I ran the experiments on an 8-node, 48-core
AMD machine with a local[N] master. But for N > 10, the completion time
of the experiment kept increasing rather than decreasing.
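For reference, local[N] here means a single JVM driving N worker threads. A
minimal sketch of such a setup (illustrative only; the bundled example
actually takes the master from the launch arguments/environment):

    import org.apache.spark.{SparkConf, SparkContext}

    // Illustrative setup only: one JVM, N worker threads.
    val conf = new SparkConf()
      .setMaster("local[48]") // N = 48 for the profiled run
      .setAppName("SparkPageRank")
    val sc = new SparkContext(conf)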
When I profiled it using JProfiler, I observed that it wasn't any lock
that consumed the CPU time. Instead, as N grew, the amount of time spent
in the following functions kept increasing:
1) java.io.ObjectOutputStream.writeObject0
2) scala.Tuple2.hashCode
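For context, the heart of the example (paraphrased from the bundled source,
not verbatim) joins the links with the ranks and reduces the contributions by
key, so every iteration shuffles (String, Double) pairs. I am guessing that
is where both functions above get hit:

    // Paraphrase of the bundled SparkPageRank (not verbatim).
    // links: url -> all outgoing neighbors, cached across iterations.
    val links = sc.textFile("links.txt") // one "src dst" edge per line
      .map { s => val parts = s.split("\\s+"); (parts(0), parts(1)) }
      .distinct().groupByKey().cache()
    var ranks = links.mapValues(_ => 1.0)

    val iters = 10
    for (i <- 1 to iters) {
      // join() and reduceByKey() hash-partition (url, value) pairs,
      // which would account for scala.Tuple2.hashCode; the shuffled
      // data goes through Java serialization by default, which would
      // account for java.io.ObjectOutputStream.writeObject0.
      val contribs = links.join(ranks).values.flatMap { case (urls, rank) =>
        urls.map(url => (url, rank / urls.size))
      }
      ranks = contribs.reduceByKey(_ + _).mapValues(0.15 + 0.85 * _)
    }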
I confirmed the same with OProfile as well; the findings are consistent.
I am attaching the jstack output, taken at two points during the N=48 run.
I ran the tests with both Spark 1.0.0 and Spark 0.9.0.
Can someone please suggest what is wrong?
Regards,
Lokesh Gidra
lessoutput3.lessoutput3
<http://apache-spark-user-list.1001560.n3.nabble.com/file/n9521/lessoutput3.lessoutput3>
lessoutput4.lessoutput4
<http://apache-spark-user-list.1001560.n3.nabble.com/file/n9521/lessoutput4.lessoutput4>