Posted to user@spark.apache.org by "lokesh.gidra" <lo...@gmail.com> on 2014/07/12 23:43:18 UTC

Scalability issue in Spark with SparkPageRank example

Hello, 


I ran the SparkPageRank example (the one included in the package) to evaluate 
how well Spark scales within a single machine. I ran the experiments on an 
8-node, 48-core AMD machine with a local[N] master. However, for N > 10, the 
completion time of the experiment kept increasing rather than decreasing. 
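
For clarity, by a local[N] master I mean running the driver and all tasks in a 
single JVM with N worker threads, roughly like this (the app name and the 
value of N below are just placeholders for what I varied):

    import org.apache.spark.{SparkConf, SparkContext}

    // Driver and all tasks run inside one JVM, using N local threads.
    val conf = new SparkConf()
      .setAppName("SparkPageRank")
      .setMaster("local[48]")   // I varied N between 1 and 48
    val sc = new SparkContext(conf)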

When I profiled it using JProfiler, I observed that it wasn't any lock that 
was consuming the CPU time. Instead, the amount of time spent in the 
following functions kept increasing with N (see the sketch after the list): 

1) java.io.ObjectOutputStream.writeObject0 
2) scala.Tuple2.hashCode 
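
For context, the core loop of the bundled SparkPageRank example looks roughly 
like the sketch below (simplified from memory, not the exact source). The 
(String, Double) pairs it shuffles are the Tuple2 instances whose hashCode and 
serialization appear in the profile:

    // links: RDD[(String, Iterable[String])], url -> outgoing links,
    // built from the input file and cached, as in the bundled example.
    var ranks = links.mapValues(_ => 1.0)
    for (i <- 1 to iterations) {
      // join + flatMap produce (url, contribution) pairs ...
      val contribs = links.join(ranks).values.flatMap { case (urls, rank) =>
        urls.map(url => (url, rank / urls.size))
      }
      // ... which get hash-partitioned (scala.Tuple2.hashCode) and
      // serialized (ObjectOutputStream.writeObject0) during the shuffle.
      ranks = contribs.reduceByKey(_ + _).mapValues(0.15 + 0.85 * _)
    }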

I confirmed the same with OProfile as well; the findings are consistent across 
both profilers. 

I am attaching the jstack output, which I captured twice during the N=48 run. 
I ran the tests with both Spark 1.0.0 and Spark 0.9.0.

Can someone please suggest what might be wrong? 


Regards,
Lokesh Gidra

lessoutput3.lessoutput3
<http://apache-spark-user-list.1001560.n3.nabble.com/file/n9521/lessoutput3.lessoutput3>  
lessoutput4.lessoutput4
<http://apache-spark-user-list.1001560.n3.nabble.com/file/n9521/lessoutput4.lessoutput4>  


