Posted to issues@spark.apache.org by "Liang-Chi Hsieh (JIRA)" <ji...@apache.org> on 2015/04/23 03:18:38 UTC

[jira] [Commented] (SPARK-6378) srcAttr in graph.triplets don't update when the size of graph is huge

    [ https://issues.apache.org/jira/browse/SPARK-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508262#comment-14508262 ] 

Liang-Chi Hsieh commented on SPARK-6378:
----------------------------------------

After checking the code, it looks like whether the triplets are updated depends on whether VD and VD2 (the vertex attribute types before and after outerJoinVertices) are equal. If VD and VD2 are different, only the vertices will be updated. Can you check whether the vertex data types of your small and big graphs are the same?
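
To help narrow this down, here is a minimal sketch (not the reporter's code) that calls outerJoinVertices once with the vertex type preserved (Int to Int) and once with it changed (Int to String), then counts triplets whose srcAttr disagrees with graph.vertices. The object name TripletUpdateCheck, the helper staleSrcAttrs, the toy three-vertex graph, and the local[2] master are all assumptions made for illustration; a graph this small may well not reproduce the problem, so the helper is the part intended to be run against the real data.

```scala
import scala.reflect.ClassTag

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits; needed on Spark 1.2.x
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

object TripletUpdateCheck {

  // Hypothetical helper: counts triplets whose srcAttr differs from the
  // attribute stored for the same vertex id in g.vertices.
  def staleSrcAttrs[VD: ClassTag, ED](g: Graph[VD, ED]): Long =
    g.triplets
      .map(t => (t.srcId, t.srcAttr))
      .join(g.vertices)
      .filter { case (_, (inTriplet, inVertices)) => inTriplet != inVertices }
      .count()

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("TripletUpdateCheck").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Toy graph (assumed data): three vertices, all starting with attribute 0.
    val vertices: RDD[(VertexId, Int)] = sc.parallelize(Seq((1L, 0), (2L, 0), (3L, 0)))
    val edges: RDD[Edge[Int]] = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1)))
    val graph: Graph[Int, Int] = Graph(vertices, edges)

    // New attribute for vertex 1 only; the other vertices keep their old value.
    val updates: RDD[(VertexId, Int)] = sc.parallelize(Seq((1L, 2)))

    // Case 1: VD == VD2 (Int => Int).
    val sameType: Graph[Int, Int] =
      graph.outerJoinVertices(updates) { (id, old, opt) => opt.getOrElse(old) }

    // Case 2: VD != VD2 (Int => String).
    val newType: Graph[String, Int] =
      graph.outerJoinVertices(updates) { (id, old, opt) => opt.getOrElse(old).toString }

    println("stale srcAttr count, same vertex type:    " + staleSrcAttrs(sameType))
    println("stale srcAttr count, changed vertex type: " + staleSrcAttrs(newType))

    sc.stop()
  }
}
```

A non-zero count for either graph would show which of the two cases leaves the triplets out of sync with the vertices.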

> srcAttr in graph.triplets don't update when the size of graph is huge
> ---------------------------------------------------------------------
>
>                 Key: SPARK-6378
>                 URL: https://issues.apache.org/jira/browse/SPARK-6378
>             Project: Spark
>          Issue Type: Bug
>          Components: GraphX
>    Affects Versions: 1.2.1
>            Reporter: zhangzhenyue
>
> When the graph is huge (0.2 billion vertices, 6 billion edges), the srcAttr and dstAttr in graph.triplets are not updated after calling Graph.outerJoinVertices (when the vertex data has changed).
> The code and the log are as follows:
> {quote}
> g = graph.outerJoinVertices()...
> g.vertices.count()
> g.edges.count()
> println("example edge " + g.triplets.filter(e => e.srcId == 5000000001L).collect()
>       .map(e => (e.srcId + ":" + e.srcAttr + ", " + e.dstId + ":" + e.dstAttr)).mkString("\n"))
> println("example vertex " + g.vertices.filter(e => e._1 == 5000000001L).collect()
>       .map(e => (e._1 + "," + e._2)).mkString("\n"))
> {quote}
> The result:
> {quote}
> example edge 5000000001:0, 2467451620:61
> 5000000001:0, 1962741310:83 // attr of vertex 5000000001 is 0 in Graph.triplets
> example vertex 5000000001,2 // attr of vertex 5000000001 is 2 in Graph.vertices
> {quote}
> When the graph is smaller (10 million vertices), the code works: the triplets are updated when the vertex data changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
