Posted to dev@giraph.apache.org by "Maja Kabiljo (JIRA)" <ji...@apache.org> on 2013/02/08 00:25:12 UTC

[jira] [Updated] (GIRAPH-508) Increase the limit on the number of partitions

     [ https://issues.apache.org/jira/browse/GIRAPH-508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maja Kabiljo updated GIRAPH-508:
--------------------------------

    Attachment: GIRAPH-508.diff
    
> Increase the limit on the number of partitions
> ----------------------------------------------
>
>                 Key: GIRAPH-508
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-508
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Maja Kabiljo
>            Assignee: Maja Kabiljo
>         Attachments: GIRAPH-508.diff
>
>
> We currently limit the total number of partitions to 2995. The limit comes from ZooKeeper's 1MB znode size limit, together with the assumption that a partition owner description can take 300 bytes.
> In the simplest case, when checkpointing is not used and partitions don't move around, we write 5 ints and a hostname per partition. If partitions move around, we write one more hostname and 2 more ints. When checkpointing is used, we also write the path to the checkpoint file.
> For now, we can drop the whole WorkerInfo description from each partition and use just taskIds, since all WorkerInfos are written at the beginning. This leaves just 4 ints per partition when checkpointing is not used, and allows many more partitions.
> When checkpointing is used, we can keep the limit (while still raising it a bit), or have all workers read the partition metadata when restarting from a checkpoint.
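
To make the size argument above concrete, here is a rough sketch of the arithmetic in Java. It is not Giraph's actual serialization code: the class name PartitionBudget, the entry layout, and the hostname are made up for illustration, and it assumes ints are written as 4 bytes and hostnames with DataOutputStream.writeUTF (2-byte length prefix plus one byte per ASCII character). As an aside, 1 MiB divided by 300 bytes gives 3495 entries, while 2995 corresponds to a budget of roughly 350 bytes per entry; the description does not say where the difference comes from.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class PartitionBudget {
    /** ZooKeeper znode size limit cited in the issue (1 MB, taken here as 1 MiB). */
    public static final int ZNODE_LIMIT_BYTES = 1024 * 1024;

    /**
     * Serialized size of one hypothetical partition entry:
     * intCount ints followed by the given hostnames.
     */
    public static int entrySize(int intCount, String... hostnames) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (int i = 0; i < intCount; i++) {
                out.writeInt(i);        // 4 bytes each; the value is a placeholder
            }
            for (String hostname : hostnames) {
                out.writeUTF(hostname); // 2-byte length + encoded characters
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    /** Number of whole entries of the given size that fit in one znode. */
    public static int maxPartitions(int bytesPerPartition) {
        return ZNODE_LIMIT_BYTES / bytesPerPartition;
    }

    public static void main(String[] args) {
        String host = "worker-17.example.com";  // made-up hostname, 21 characters

        int withHostname = entrySize(5, host);  // simplest case described in the issue
        int taskIdsOnly = entrySize(4);         // proposed layout: ints only

        System.out.println("5 ints + hostname: " + withHostname + " bytes -> "
            + maxPartitions(withHostname) + " partitions");
        System.out.println("4 ints only:       " + taskIdsOnly + " bytes -> "
            + maxPartitions(taskIdsOnly) + " partitions");
        System.out.println("300-byte budget:   " + maxPartitions(300) + " partitions");
    }
}
```

Under these assumptions, the 4-int layout takes 16 bytes per partition, so about 65536 entries would fit in one znode before counting any other data stored alongside them. The real headroom depends on the actual wire format and on whatever else shares the znode.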
