You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tinkerpop.apache.org by "Matthias Broecheler (JIRA)" <ji...@apache.org> on 2015/10/14 23:49:05 UTC

[jira] [Created] (TINKERPOP3-889) Support for partitioned vertices in GraphComputer

Matthias Broecheler created TINKERPOP3-889:
----------------------------------------------

             Summary: Support for partitioned vertices in GraphComputer
                 Key: TINKERPOP3-889
                 URL: https://issues.apache.org/jira/browse/TINKERPOP3-889
             Project: TinkerPop 3
          Issue Type: New Feature
            Reporter: Matthias Broecheler
            Priority: Critical


Most natural graphs have scale free distributions which means that some vertices in the graph have significantly more incident edges than others. On large graphs, it is therefore not uncommon to encounter vertices whose adjacency list is to large to be accommodated efficiently by a single machine (due to lack of sufficient memory or because they create a hotspot).
Other graph computing frameworks have successfully addressed this "supernode problem" by partitioning the vertex's adjacency list and processing each subset of the adjacency list separately because merging the results (e.g. counting edges in each subset and then adding those values to get the total degree). TinkerPop should implement such functionality and allow partitioned vertex adjacency lists in the input to GraphComputer. This is a critical feature to make TP applicable to large graph computations.

This can be implemented fairly easily for local messages using the message combiners. Global messages can be tricky however. See also TINKERPOP3-383. This has been partially implemented in Titan's Fulgora package (https://github.com/thinkaurelius/titan/tree/titan10/titan-core/src/main/java/com/thinkaurelius/titan/graphdb/olap).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)