Posted to dev@tinkerpop.apache.org by "Marko A. Rodriguez (JIRA)" <ji...@apache.org> on 2016/10/31 16:49:58 UTC

[jira] [Commented] (TINKERPOP-1118) SparkGraphComputer should use StarGraph, not VertexWritable.

    [ https://issues.apache.org/jira/browse/TINKERPOP-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15622698#comment-15622698 ] 

Marko A. Rodriguez commented on TINKERPOP-1118:
-----------------------------------------------

I think we can get rid of the {{VertexWritable}}/{{ObjectWritable}} serialization issues if we solve this ticket. cc/ [~dalaro]

Right now, {{VertexWritable}} and {{ObjectWritable}} have their own serialization logic. This matters because these classes are used not only while running jobs, but also for reading and writing {{SequenceFiles}}. In Spark, the RDD does not need to use these writables and can instead directly reference the objects they wrap. That would give us a cleaner split between {{GryoInput/OutputFormat}} and the internal job serialization (message passing and the like).
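
As a rough illustration of that split, here is a minimal, self-contained sketch in plain Java. It does not use TinkerPop or Spark: {{Star}}, {{StarWritable}}, {{readWrapped}} and {{unwrap}} are hypothetical stand-ins (loosely playing the roles of {{StarGraph}}, {{VertexWritable}}, an input format and the job itself), and an ordinary list of key/value pairs stands in for the RDD. The only point it tries to show is that the wrapper is needed at the I/O boundary, while the job can hold the wrapped objects directly.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class WritableUnwrapSketch {

    /** Hypothetical stand-in for the object the job actually works on. */
    static final class Star {
        final Object vertexId;
        Star(Object vertexId) { this.vertexId = vertexId; }
    }

    /** Hypothetical stand-in for an I/O wrapper: one extra object per value. */
    static final class StarWritable {
        private final Star star;
        StarWritable(Star star) { this.star = star; }
        Star get() { return star; }
    }

    /** I/O boundary: records as an input format might hand them over, still wrapped. */
    static List<Map.Entry<Object, StarWritable>> readWrapped(Object... ids) {
        List<Map.Entry<Object, StarWritable>> records = new ArrayList<>();
        for (Object id : ids) {
            records.add(new SimpleEntry<>(id, new StarWritable(new Star(id))));
        }
        return records;
    }

    /** Job side: drop the wrapper once so later stages reference the wrapped object directly. */
    static List<Map.Entry<Object, Star>> unwrap(List<Map.Entry<Object, StarWritable>> wrapped) {
        List<Map.Entry<Object, Star>> direct = new ArrayList<>();
        for (Map.Entry<Object, StarWritable> record : wrapped) {
            direct.add(new SimpleEntry<>(record.getKey(), record.getValue().get()));
        }
        return direct;
    }

    public static void main(String[] args) {
        // The pairs below hold Star values, not StarWritable values.
        List<Map.Entry<Object, Star>> direct = unwrap(readWrapped(1L, 2L, 3L));
        for (Map.Entry<Object, Star> record : direct) {
            System.out.println(record.getKey() + " -> vertex " + record.getValue().vertexId);
        }
    }
}
```

In the real code base the unwrapping would presumably happen where the input RDD is created, so the writable's serialization logic stays with the file formats only; that placement is an assumption on my part, not something this sketch demonstrates.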

> SparkGraphComputer should use StarGraph, not VertexWritable.
> ------------------------------------------------------------
>
>                 Key: TINKERPOP-1118
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1118
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: hadoop
>    Affects Versions: 3.1.1-incubating
>            Reporter: Marko A. Rodriguez
>              Labels: breaking
>             Fix For: 3.3.0
>
>
> {{SparkGraphComputer}} input RDDs are typed as:
> {code}
> JavaPairRDD<Object,VertexWritable>
> {code}
> The {{VertexWritable}} usage is a vestige from Hadoop and Giraph. In Spark, we don't need this wrapper, and thus we can reduce the overhead (one less object header) by typing the input RDDs as:
> {code}
> JavaPairRDD<Object,StarGraph>
> {code}
> This would be a breaking change for graph providers that implement their own {{InputRDD}} and {{OutputRDD}}; however, the fix is trivial. Instead of {{new VertexWritable(vertex)}}, they would simply do {{StarGraph.of(vertex)}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)