Posted to user@spark.apache.org by moxing <do...@alibaba-inc.com> on 2014/03/25 09:42:35 UTC
graph.persist error
Hi,
I am working with a graph of 20 million vertices and 2 billion edges. When I try to persist the graph, an exception is thrown:

Caused by: java.lang.UnsupportedOperationException: Cannot change storage
level of an RDD after it was already assigned a level
Here is my code:
def main(args: Array[String]) {
  if (args.length == 0) {
    System.err.println("Usage: Graph_on_Spark [master] <slices>")
    System.exit(1)
  }
  val sc = new SparkContext(args(0), "Graph_on_Spark",
    System.getenv("SPARK_HOME"), Seq(System.getenv("SPARK_EXAMPLES_JAR")))
  val hdfspath = ""
  var userRDD = sc.textFile(…)
  var edgeRDD: RDD[Edge[String]] = sc.textFile(…)
  for (no <- 1 to 4) {
    val vertexfile = sc.textFile(…)
    userRDD = userRDD.union(vertexfile.map { … })
    val edgefile = sc.textFile(…)
    edgeRDD = edgeRDD.union(…)
  }
  val graph = Graph(userRDD, edgeRDD, "Empty")
  println(graph.vertices.count)
  println(graph.edges.count)
  println("graph form success")
  val initialgraph = graph.persist(storage.StorageLevel.DISK_ONLY)
I have not called cache or persist anywhere before this point.
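For context, here is a minimal plain-Scala sketch (not Spark's actual code, just a simplified stand-in for the check in RDD.persist) of the guard that produces this message: an RDD remembers the first storage level it is assigned and rejects any later call with a different one. Since GraphX assigns a storage level to the graph's internal RDDs when the Graph is constructed, a later persist(DISK_ONLY) trips the same check.

```scala
// Hypothetical simplification of the storage-level guard; the real check
// lives in org.apache.spark.rdd.RDD.persist.
class CachedData {
  private var storageLevel: Option[String] = None // e.g. "MEMORY_ONLY"

  def persist(newLevel: String): this.type = {
    storageLevel match {
      case Some(old) if old != newLevel =>
        // First level wins; any different level is rejected.
        throw new UnsupportedOperationException(
          "Cannot change storage level of an RDD after it was already assigned a level")
      case _ =>
        storageLevel = Some(newLevel)
    }
    this
  }
}

val d = new CachedData
d.persist("MEMORY_ONLY") // first assignment succeeds
d.persist("MEMORY_ONLY") // re-persisting with the same level is allowed
// d.persist("DISK_ONLY") // would throw UnsupportedOperationException
```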
A second question: when I execute the code below, I get an exception:
Exception failure: java.lang.ArrayIndexOutOfBoundsException
while (i < maxIter) {
  println("Iteration")
  println(g.vertices.count)
  val newVerts = g.vertices.innerJoin(messages)(pregel_vprog)
  g = g.outerJoinVertices(newVerts) { (vid, old, newOpt) =>
    newOpt.getOrElse((old._1, ""))
  }
  println(g.vertices.count)
  messages = g.mapReduceTriplets[String](
    pregel_sendMsg, pregel_mergeFunc, Some((newVerts, activeDir)))
  println(g.vertices.count)
  i += 1
}
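To make the loop's join semantics concrete, here is a plain-Scala sketch of its data flow, with Maps standing in for the vertex and message RDDs (innerJoin/outerJoin here are simplified stand-ins, and vprog is a hypothetical placeholder for pregel_vprog): only vertices that received a message are updated, and every other vertex keeps its ID but has its second attribute field reset to "".

```scala
// innerJoin: apply vprog only to vertices that actually received a message.
def innerJoin(verts: Map[Long, (Long, String)], msgs: Map[Long, String])
             (vprog: (Long, (Long, String), String) => (Long, String)): Map[Long, (Long, String)] =
  msgs.flatMap { case (id, m) => verts.get(id).map(attr => id -> vprog(id, attr, m)) }

// outerJoin: merge updates back; vertices without an update keep their
// first field and get "" as the second, mirroring newOpt.getOrElse((old._1, "")).
def outerJoin(verts: Map[Long, (Long, String)],
              updated: Map[Long, (Long, String)]): Map[Long, (Long, String)] =
  verts.map { case (id, old) => id -> updated.getOrElse(id, (old._1, "")) }

val verts = Map(1L -> (1L, "a"), 2L -> (2L, "b"))
val msgs = Map(1L -> "m")                             // only vertex 1 receives a message
val upd = innerJoin(verts, msgs)((id, a, m) => (a._1, m))
val merged = outerJoin(verts, upd)                    // vertex 2 keeps its ID, attr reset to ""
```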
Thanks
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/graph-persist-error-tp3179.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.