Posted to dev@giraph.apache.org by "Brian Femiano (JIRA)" <ji...@apache.org> on 2013/03/06 06:52:12 UTC

[jira] [Created] (GIRAPH-552) HBaseVertexInputFormat is ignoring region locality on input superstep

Brian Femiano created GIRAPH-552:
------------------------------------

             Summary: HBaseVertexInputFormat is ignoring region locality on input superstep
                 Key: GIRAPH-552
                 URL: https://issues.apache.org/jira/browse/GIRAPH-552
             Project: Giraph
          Issue Type: Bug
          Components: graph
    Affects Versions: 0.2.0
            Reporter: Brian Femiano


During the input superstep, data for HBase regions is needlessly transferred across the network. Workers do not give preference to machine-local regions when they are available.

On moderate to large graphs (5 million vertices, 10 million edges) we've noticed this causing resource contention, ZooKeeper timeouts, and other issues that often freeze the input superstep until the tasks are manually killed on the task tracker hosts.

This doesn't happen for TextVertexInputFormat subclasses. It may be related to each instance of the HBaseVertexInputFormat subclass delegating to a private TableInputFormat instance.
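
To make the expected behavior concrete, here is a minimal sketch of locality-first split claiming. It is not Giraph or HBase code: LocalityFirstSplitPicker, RegionSplit, and claimNext are hypothetical names used only for illustration, and the host list stands in for the locations a Hadoop InputSplit reports through getLocations(). The assumption is that a worker should first claim a split hosted on its own machine and only fall back to a remote one when no local split is left.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not Giraph's or HBase's actual classes. */
public class LocalityFirstSplitPicker {

    /** Stand-in for an input split: a region name plus the hosts serving it. */
    public static final class RegionSplit {
        public final String regionName;
        public final List<String> hosts;

        public RegionSplit(String regionName, String... hosts) {
            this.regionName = regionName;
            this.hosts = Arrays.asList(hosts);
        }
    }

    /**
     * Removes and returns a split hosted on workerHost if one exists;
     * otherwise removes and returns the first remaining (remote) split.
     * Returns null when no splits are left.
     */
    public static RegionSplit claimNext(String workerHost,
                                        List<RegionSplit> unclaimed) {
        // First pass: prefer a split whose region lives on this worker's host.
        for (int i = 0; i < unclaimed.size(); i++) {
            if (unclaimed.get(i).hosts.contains(workerHost)) {
                return unclaimed.remove(i);
            }
        }
        // Fallback: no local split remains, so take any remote one.
        return unclaimed.isEmpty() ? null : unclaimed.remove(0);
    }
}
```

With two unclaimed splits, one on host1 and one on host2, a worker on host2 would claim the host2 split first even though it is second in the list. The behavior described in this issue looks more like the fallback branch being taken every time.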
