
Posted to dev@tinkerpop.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/12/05 08:51:00 UTC

[jira] [Commented] (TINKERPOP-2834) CloneVertexProgram optimization on SparkGraphComputer

    [ https://issues.apache.org/jira/browse/TINKERPOP-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17643168#comment-17643168 ] 

ASF GitHub Bot commented on TINKERPOP-2834:
-------------------------------------------

ministat opened a new pull request, #1885:
URL: https://github.com/apache/tinkerpop/pull/1885

   The current CloneVertexProgram does nothing in its execute method, yet SparkGraphComputer still runs it as a general VertexProgram, which requires a shuffle stage that can be avoided. This PR implements a shortcut that skips that stage. When I exported two big graphs, the overall export time improved considerably (about one third less for Graph 1 and about 27% less for Graph 2). See the following table.
   ```
   ------------------------------
   Export time | Graph 1 | Graph 2
   ------------------------------
   Before fix  | 3.6h    | 22min
   ------------------------------
   After fix   | 2.4h    | 16min
   ------------------------------
   ```
   Graph 1 has 15 billion vertices and 23 billion edges. Graph 2 has 130 million vertices and 650 million edges.
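
The shortcut idea can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code in this PR and not the TinkerPop API: `ToyVertexProgram`, `ToyInterceptor`, and `ToyComputer` are hypothetical stand-ins, vertices are plain strings rather than a Spark RDD, and a counter stands in for the shuffle stage. It shows a computer that skips its general execution path whenever an interceptor is supplied, which is all a program with a no-op `execute` needs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InterceptorSketch {

    /** Hypothetical stand-in for a vertex program (not TinkerPop's VertexProgram). */
    interface ToyVertexProgram {
        void execute(String vertex);
    }

    /** Hypothetical stand-in for an interceptor that replaces the general path. */
    interface ToyInterceptor {
        List<String> apply(ToyVertexProgram program, List<String> input);
    }

    /** Toy computer that counts how often the costly general path (the "shuffle") runs. */
    static final class ToyComputer {
        int shuffles = 0;

        List<String> submit(ToyVertexProgram program, List<String> input, ToyInterceptor interceptor) {
            if (interceptor != null) {
                // Shortcut: the interceptor takes over, so the general path never runs.
                return interceptor.apply(program, input);
            }
            // General path: call execute() on every vertex, then pay for one message exchange.
            List<String> output = new ArrayList<>(input);
            for (String vertex : output) {
                program.execute(vertex);
            }
            shuffles++;
            return output;
        }
    }

    /** Runs a no-op program over three vertices and returns the number of shuffles used. */
    static int countShuffles(boolean useInterceptor) {
        ToyVertexProgram cloneLike = vertex -> { };              // does nothing, like a clone-only program
        ToyInterceptor passThrough = (program, input) -> input;  // hands the input back unchanged
        ToyComputer computer = new ToyComputer();
        List<String> result = computer.submit(
                cloneLike, Arrays.asList("v1", "v2", "v3"), useInterceptor ? passThrough : null);
        if (result.size() != 3) {
            throw new IllegalStateException("vertices were lost");
        }
        return computer.shuffles;
    }

    public static void main(String[] args) {
        System.out.println("shuffles without interceptor: " + countShuffles(false));
        System.out.println("shuffles with interceptor:    " + countShuffles(true));
    }
}
```

In this model both paths return the same vertices, and only the number of simulated shuffles differs. In SparkGraphComputer the saving would come from not scheduling the corresponding Spark stage.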




> CloneVertexProgram optimization on SparkGraphComputer
> -----------------------------------------------------
>
>                 Key: TINKERPOP-2834
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2834
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: hadoop
>            Reporter: Redriver
>            Priority: Major
>
> The CloneVertexProgram does nothing in its execute() method, but SparkGraphComputer still has to process it with standard GraphComputer semantics, which incurs a lot of unnecessary computation. In fact, registering a special SparkVertexProgramInterceptor with an empty apply() can improve the overall performance a lot.


