Posted to server-dev@james.apache.org by "Ioan Eugen Stan (Commented) (JIRA)" <ji...@apache.org> on 2012/01/27 12:09:40 UTC
[jira] [Commented] (MAILBOX-103) [gsoc2011] Design and implement
Distributed UID generation
[ https://issues.apache.org/jira/browse/MAILBOX-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194605#comment-13194605 ]
Ioan Eugen Stan commented on MAILBOX-103:
-----------------------------------------
We can use ZooKeeper to implement this. Full thread: http://mail-archives.apache.org/mod_mbox/zookeeper-user/201201.mbox/%3CCAFvdMiCeMRxJaRg56zAFMRQSMB_oxRMzAYJ7e%3DJOQVf94Wscdg%40mail.gmail.com%3E
Use plain ZooKeeper and rely on the znode version for sequence generation, for both UIDs and ModSeq.
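A minimal sketch of the idea, not the James implementation: ZooKeeper bumps a znode's data version by one on every successful write, so writing to a per-mailbox znode and reading back the new version yields an increasing sequence. The classes below (VersionedNode, VersionUidProvider, UidDemo) are hypothetical in-memory stand-ins so the example runs without an ensemble. Against a real ensemble, the corresponding step would presumably be ZooKeeper.setData(path, data, -1) followed by Stat.getVersion() on the returned Stat.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical in-memory stand-in for one znode. It models a single
 * ZooKeeper behaviour: each successful write increases the data version by one.
 */
final class VersionedNode {
    // A freshly created znode has data version 0.
    private final AtomicInteger version = new AtomicInteger(0);

    /** Models an unconditional write; the payload is ignored, the new version is returned. */
    int setData(byte[] data) {
        return version.incrementAndGet();
    }
}

/** Hands out UIDs by writing to a per-mailbox node and reading back its version. */
final class VersionUidProvider {
    private final ConcurrentHashMap<String, VersionedNode> nodes = new ConcurrentHashMap<>();

    /** Returns the next UID for the given mailbox; the first call returns 1. */
    long nextUid(String mailboxId) {
        VersionedNode node = nodes.computeIfAbsent(mailboxId, id -> new VersionedNode());
        return node.setData(new byte[0]);
    }
}

final class UidDemo {
    public static void main(String[] args) {
        VersionUidProvider provider = new VersionUidProvider();
        System.out.println(provider.nextUid("inbox-1")); // 1
        System.out.println(provider.nextUid("inbox-1")); // 2
        System.out.println(provider.nextUid("inbox-2")); // 1, each mailbox has its own sequence
    }
}
```

The same pattern would apply to ModSeq with a second znode per mailbox. Note that the znode version is a signed 32-bit int, while IMAP UIDs are unsigned 32-bit values, so a real implementation would need to decide what happens near the upper bound.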
A single ZooKeeper ensemble should scale very well, into the
millions. Beyond that, we can use multiple ZooKeeper ensembles, each
managing one shard of the mailboxes.
The first thing that comes to mind is the way Debian stores packages
[3]: it uses the first letter of the package name as a directory, so
that all packages starting with the same letter are grouped into a
single directory.
In the same way, we can have one ensemble handle all mailboxes whose
identifiers start with 0-4 and another handle those that start with
5-9. Assuming mailbox identifiers are distributed uniformly, this
splits the load in half and gives us horizontal scalability.
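The routing rule above could be sketched as follows. This is an illustration under stated assumptions: mailbox identifiers are assumed to start with a decimal digit, and EnsembleSharding and ensembleIndexFor are hypothetical names, not part of James or ZooKeeper.

```java
/** Picks an ensemble for a mailbox from the first digit of its identifier. */
final class EnsembleSharding {

    /**
     * Maps the leading digit (0-9) onto ensembleCount contiguous ranges.
     * With two ensembles, 0-4 goes to index 0 and 5-9 goes to index 1.
     */
    static int ensembleIndexFor(String mailboxId, int ensembleCount) {
        if (ensembleCount < 1 || ensembleCount > 10) {
            throw new IllegalArgumentException("ensembleCount must be between 1 and 10");
        }
        if (mailboxId == null || mailboxId.isEmpty()) {
            throw new IllegalArgumentException("mailboxId must not be empty");
        }
        int digit = Character.digit(mailboxId.charAt(0), 10);
        if (digit < 0) {
            // Assumption of this sketch: identifiers always start with a decimal digit.
            throw new IllegalArgumentException("mailboxId must start with a digit: " + mailboxId);
        }
        return digit * ensembleCount / 10;
    }

    public static void main(String[] args) {
        System.out.println(ensembleIndexFor("4711", 2)); // 0
        System.out.println(ensembleIndexFor("5012", 2)); // 1
    }
}
```

The split is only even if the leading digits really are uniformly distributed, which is the assumption stated above.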
[1] http://zookeeper.apache.org/doc/current/zookeeperOver.html#fg_zkPerfReliability
[2] http://wiki.apache.org/hadoop/ZooKeeper/ServiceLatencyOverview
[3] ftp://ftp.be.debian.org/debian/pool/main
> [gsoc2011] Design and implement Distributed UID generation
> ----------------------------------------------------------
>
> Key: MAILBOX-103
> URL: https://issues.apache.org/jira/browse/MAILBOX-103
> Project: James Mailbox
> Issue Type: New Feature
> Components: hbase
> Affects Versions: 0.4
> Reporter: Eric Charles
> Fix For: 0.4
>
>
> Context: IMAP4rev1 (RFC 3501) requires that every message be identified by a stable 32-bit Unique Identifier (UID) assigned in incremental sequence. This is currently achieved in the James IMAP subproject (http://james.apache.org/imap) with a UidProvider interface implemented in memory. This implementation does not allow the solution to work in a distributed setting.
> Task: A DistributedUidProvider must be designed and implemented. The design can rely on a distributed memory cache such as Hazelcast, or on any other solution (Hadoop, HBase, Cassandra, ...).
> Mentor: eric at apache dot org
> Complexity: medium