Posted to user@spark.apache.org by moxing <do...@alibaba-inc.com> on 2014/03/25 09:42:35 UTC
graph.persist error
Hi,
I am working with a graph of 20 million vertices and 2 billion edges. When I try to persist the graph, an exception is thrown:

Caused by: java.lang.UnsupportedOperationException: Cannot change storage
level of an RDD after it was already assigned a level
Here is my code:
def main(args: Array[String]) {
  if (args.length == 0) {
    System.err.println("Usage: Graph_on_Spark [master] <slices>")
    System.exit(1)
  }
  val sc = new SparkContext(args(0), "Graph_on_Spark",
    System.getenv("SPARK_HOME"), Seq(System.getenv("SPARK_EXAMPLES_JAR")))
  val hdfspath = ""
  var userRDD = sc.textFile(…)
  var edgeRDD: RDD[Edge[String]] = sc.textFile(…)
  for (no <- 1 to 4) {
    val vertexfile = sc.textFile(…)
    userRDD = userRDD.union(vertexfile.map { … })
    val edgefile = sc.textFile(…)
    edgeRDD = edgeRDD.union(…)
  }
  val graph = Graph(userRDD, edgeRDD, "Empty")
  println(graph.vertices.count)
  println(graph.edges.count)
  println("graph form success")
  val initialgraph = graph.persist(storage.StorageLevel.DISK_ONLY)
I have not called cache or persist anywhere before this point.
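For context, here is a minimal plain-Scala sketch (not Spark's actual code, just a simplified stand-in for the check in RDD.persist) of the guard that produces this message: an RDD remembers the first storage level it is assigned and rejects any later call with a different one. Since GraphX assigns a storage level to the graph's internal RDDs when the Graph is constructed, a later persist(DISK_ONLY) trips the same check.

```scala
// Hypothetical simplification of the storage-level guard; the real check
// lives in org.apache.spark.rdd.RDD.persist.
class CachedData {
  private var storageLevel: Option[String] = None // e.g. "MEMORY_ONLY"

  def persist(newLevel: String): this.type = {
    storageLevel match {
      case Some(old) if old != newLevel =>
        // First level wins; any different level is rejected.
        throw new UnsupportedOperationException(
          "Cannot change storage level of an RDD after it was already assigned a level")
      case _ =>
        storageLevel = Some(newLevel)
    }
    this
  }
}

val d = new CachedData
d.persist("MEMORY_ONLY") // first assignment succeeds
d.persist("MEMORY_ONLY") // re-persisting with the same level is allowed
// d.persist("DISK_ONLY") // would throw UnsupportedOperationException
```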
A second question: when I execute the code below, I get an exception:
Exception failure: java.lang.ArrayIndexOutOfBoundsException
while (i < maxIter) {
  println("Iteration")
  println(g.vertices.count)
  val newVerts = g.vertices.innerJoin(messages)(pregel_vprog)
  g = g.outerJoinVertices(newVerts) { (vid, old, newOpt) =>
    newOpt.getOrElse((old._1, ""))
  }
  println(g.vertices.count)
  messages = g.mapReduceTriplets[String](
    pregel_sendMsg, pregel_mergeFunc, Some((newVerts, activeDir)))
  println(g.vertices.count)
  i += 1
}
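To make the loop's join semantics concrete, here is a plain-Scala sketch of its data flow, with Maps standing in for the vertex and message RDDs (innerJoin/outerJoin here are simplified stand-ins, and vprog is a hypothetical placeholder for pregel_vprog): only vertices that received a message are updated, and every other vertex keeps its ID but has its second attribute field reset to "".

```scala
// innerJoin: apply vprog only to vertices that actually received a message.
def innerJoin(verts: Map[Long, (Long, String)], msgs: Map[Long, String])
             (vprog: (Long, (Long, String), String) => (Long, String)): Map[Long, (Long, String)] =
  msgs.flatMap { case (id, m) => verts.get(id).map(attr => id -> vprog(id, attr, m)) }

// outerJoin: merge updates back; vertices without an update keep their
// first field and get "" as the second, mirroring newOpt.getOrElse((old._1, "")).
def outerJoin(verts: Map[Long, (Long, String)],
              updated: Map[Long, (Long, String)]): Map[Long, (Long, String)] =
  verts.map { case (id, old) => id -> updated.getOrElse(id, (old._1, "")) }

val verts = Map(1L -> (1L, "a"), 2L -> (2L, "b"))
val msgs = Map(1L -> "m")                             // only vertex 1 receives a message
val upd = innerJoin(verts, msgs)((id, a, m) => (a._1, m))
val merged = outerJoin(verts, upd)                    // vertex 2 keeps its ID, attr reset to ""
```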
Thanks
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/graph-persist-error-tp3179.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.