Posted to server-dev@james.apache.org by "Eric Charles (JIRA)" <ji...@apache.org> on 2011/05/24 15:51:47 UTC
[jira] [Created] (MAILBOX-72) Requirements for a distributed
mailbox implementation
Requirements for a distributed mailbox implementation
-----------------------------------------------------
Key: MAILBOX-72
URL: https://issues.apache.org/jira/browse/MAILBOX-72
Project: James Mailbox
Issue Type: New Feature
Reporter: Eric Charles
Assignee: Norman Maurer
This JIRA will collect some generic technical requirements regarding a distributed mailbox implementation, whatever the implementation technology is.
The implementer is responsible for enforcing those requirements.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
For additional commands, e-mail: server-dev-help@james.apache.org
[jira] [Commented] (MAILBOX-72) Requirements for a distributed
mailbox implementation
Posted by "Eric Charles (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAILBOX-72?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049883#comment-13049883 ]
Eric Charles commented on MAILBOX-72:
-------------------------------------
To design a data model, it is important to know which kinds of queries we will have to support.
Below are the queries taken from mailbox-jpa (SQL database).
From AbstractJPAMessage:
@NamedQueries({
@NamedQuery(name="findRecentMessagesInMailbox",
query="SELECT message FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.recent = TRUE"),
@NamedQuery(name="findUnseenMessagesInMailboxOrderByUid",
query="SELECT message FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.seen = FALSE ORDER BY message.uid ASC"),
@NamedQuery(name="findMessagesInMailbox",
query="SELECT message FROM Message message WHERE message.mailbox.mailboxId = :idParam"),
@NamedQuery(name="findMessagesInMailboxBetweenUIDs",
query="SELECT message FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid BETWEEN :fromParam AND :toParam"),
@NamedQuery(name="findMessagesInMailboxWithUID",
query="SELECT message FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid=:uidParam"),
@NamedQuery(name="findMessagesInMailboxAfterUID",
query="SELECT message FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid>=:uidParam"),
@NamedQuery(name="findDeletedMessagesInMailbox",
query="SELECT message.uid FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.deleted=TRUE"),
@NamedQuery(name="findDeletedMessagesInMailboxBetweenUIDs",
query="SELECT message.uid FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid BETWEEN :fromParam AND :toParam AND message.deleted=TRUE"),
@NamedQuery(name="findDeletedMessagesInMailboxWithUID",
query="SELECT message.uid FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid=:uidParam AND message.deleted=TRUE"),
@NamedQuery(name="findDeletedMessagesInMailboxAfterUID",
query="SELECT message.uid FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid>=:uidParam AND message.deleted=TRUE"),
@NamedQuery(name="deleteDeletedMessagesInMailbox",
query="DELETE FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.deleted=TRUE"),
@NamedQuery(name="deleteDeletedMessagesInMailboxBetweenUIDs",
query="DELETE FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid BETWEEN :fromParam AND :toParam AND message.deleted=TRUE"),
@NamedQuery(name="deleteDeletedMessagesInMailboxWithUID",
query="DELETE FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid=:uidParam AND message.deleted=TRUE"),
@NamedQuery(name="deleteDeletedMessagesInMailboxAfterUID",
query="DELETE FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid>=:uidParam AND message.deleted=TRUE"),
@NamedQuery(name="countUnseenMessagesInMailbox",
query="SELECT COUNT(message) FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.seen=FALSE"),
@NamedQuery(name="countMessagesInMailbox",
query="SELECT COUNT(message) FROM Message message WHERE message.mailbox.mailboxId = :idParam"),
@NamedQuery(name="deleteMessages",
query="DELETE FROM Message message WHERE message.mailbox.mailboxId = :idParam"),
@NamedQuery(name="findLastUidInMailbox",
query="SELECT message.uid FROM Message message WHERE message.mailbox.mailboxId = :idParam ORDER BY message.uid DESC"),
@NamedQuery(name="deleteAllMemberships",
query="DELETE FROM Message message")
})
From JPAMailbox:
@NamedQuery(name="findMailboxById",
query="SELECT mailbox FROM Mailbox mailbox WHERE mailbox.mailboxId = :idParam"),
@NamedQuery(name="findMailboxByName",
query="SELECT mailbox FROM Mailbox mailbox WHERE mailbox.name = :nameParam and mailbox.user is NULL and mailbox.namespace= :namespaceParam"),
@NamedQuery(name="findMailboxByNameWithUser",
query="SELECT mailbox FROM Mailbox mailbox WHERE mailbox.name = :nameParam and mailbox.user= :userParam and mailbox.namespace= :namespaceParam"),
@NamedQuery(name="deleteAllMailboxes",
query="DELETE FROM Mailbox mailbox"),
@NamedQuery(name="findMailboxWithNameLikeWithUser",
query="SELECT mailbox FROM Mailbox mailbox WHERE mailbox.name LIKE :nameParam and mailbox.user= :userParam and mailbox.namespace= :namespaceParam"),
@NamedQuery(name="findMailboxWithNameLike",
query="SELECT mailbox FROM Mailbox mailbox WHERE mailbox.name LIKE :nameParam and mailbox.user is NULL and mailbox.namespace= :namespaceParam"),
@NamedQuery(name="countMailboxesWithNameLikeWithUser",
query="SELECT COUNT(mailbox) FROM Mailbox mailbox WHERE mailbox.name LIKE :nameParam and mailbox.user= :userParam and mailbox.namespace= :namespaceParam"),
@NamedQuery(name="countMailboxesWithNameLike",
query="SELECT COUNT(mailbox) FROM Mailbox mailbox WHERE mailbox.name LIKE :nameParam and mailbox.user is NULL and mailbox.namespace= :namespaceParam"),
@NamedQuery(name="listMailboxes",
query="SELECT mailbox FROM Mailbox mailbox")
From JPASubscription:
@NamedQueries({
@NamedQuery(name = "findFindMailboxSubscriptionForUser",
query = "SELECT subscription FROM Subscription subscription WHERE subscription.username = :userParam AND subscription.mailbox = :mailboxParam"),
@NamedQuery(name = "findSubscriptionsForUser",
query = "SELECT subscription FROM Subscription subscription WHERE subscription.username = :userParam")
})
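Most of the message queries above filter on a mailbox id plus a UID, a UID range, or the highest UID. As a rough illustration of what that means for a store whose rows are sorted by key, here is a minimal in-memory sketch that uses a composite (mailboxId, uid) key. The class and method names (SortedMessageIndex, findBetweenUids, ...) are hypothetical and are not part of mailbox-jpa or of any James API; a TreeMap only stands in for a sorted distributed store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for a store whose rows are sorted by key. */
class SortedMessageIndex {

    /** Composite key (mailboxId, uid), ordered by mailbox first and UID second. */
    static final class Key implements Comparable<Key> {
        final long mailboxId;
        final long uid;

        Key(long mailboxId, long uid) {
            this.mailboxId = mailboxId;
            this.uid = uid;
        }

        @Override
        public int compareTo(Key other) {
            int byMailbox = Long.compare(mailboxId, other.mailboxId);
            return byMailbox != 0 ? byMailbox : Long.compare(uid, other.uid);
        }
    }

    // Message payloads are plain strings here to keep the sketch short.
    private final TreeMap<Key, String> rows = new TreeMap<>();

    void put(long mailboxId, long uid, String message) {
        rows.put(new Key(mailboxId, uid), message);
    }

    /** Counterpart of findMessagesInMailboxBetweenUIDs (BETWEEN is inclusive at both ends). */
    List<String> findBetweenUids(long mailboxId, long from, long to) {
        return new ArrayList<>(
                rows.subMap(new Key(mailboxId, from), true, new Key(mailboxId, to), true).values());
    }

    /** Counterpart of findMessagesInMailboxAfterUID (the JPQL above uses >=). */
    List<String> findFromUid(long mailboxId, long uid) {
        return findBetweenUids(mailboxId, uid, Long.MAX_VALUE);
    }

    /** Counterpart of findLastUidInMailbox; returns -1 when the mailbox has no messages. */
    long findLastUid(long mailboxId) {
        Key last = rows.floorKey(new Key(mailboxId, Long.MAX_VALUE));
        return (last != null && last.mailboxId == mailboxId) ? last.uid : -1L;
    }
}
```

With such a key, the UID-based queries become range scans. The flag predicates (recent, seen, deleted) are not covered by this key and would need a filter over the scan or a separate index.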
[jira] [Updated] (MAILBOX-72) Requirements for a distributed
mailbox implementation
Posted by "Eric Charles (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAILBOX-72?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eric Charles updated MAILBOX-72:
--------------------------------
Attachment: Datamodel-mailbox-0.2.png
The attached image shows the domain classes that the store must implement:
- Mailbox (with id, namespace, user, name, uidValidity)
- Subscription (with mailbox, user)
- Message (with date, mailboxid, uid, flags, fullcontent, bodycontent, mediatype, headers, properties,...)
- Header (with fieldname, linenumber, value)
- Property (with namespace, localname, value)
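The list above can be read as a set of plain value classes. The following is only a sketch of that reading: the field names follow the list, but every Java type (long ids, byte[] content, String flags, a String mailbox name in Subscription) is an assumption, and the "..." in the Message entry is left out because the list does not say what it covers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Field names come from the list above; all types are assumptions.
final class Mailbox {
    final long id;
    final String namespace;
    final String user;      // may be null, as in the "user is NULL" queries
    final String name;
    final long uidValidity;

    Mailbox(long id, String namespace, String user, String name, long uidValidity) {
        this.id = id;
        this.namespace = namespace;
        this.user = user;
        this.name = name;
        this.uidValidity = uidValidity;
    }
}

final class Subscription {
    final String mailbox;   // assumed to be the mailbox name
    final String user;

    Subscription(String mailbox, String user) {
        this.mailbox = mailbox;
        this.user = user;
    }
}

final class Header {
    final String fieldName;
    final int lineNumber;
    final String value;

    Header(String fieldName, int lineNumber, String value) {
        this.fieldName = fieldName;
        this.lineNumber = lineNumber;
        this.value = value;
    }
}

final class Property {
    final String namespace;
    final String localName;
    final String value;

    Property(String namespace, String localName, String value) {
        this.namespace = namespace;
        this.localName = localName;
        this.value = value;
    }
}

final class Message {
    final Date date;
    final long mailboxId;
    final long uid;
    final Set<String> flags;
    final byte[] fullContent;
    final byte[] bodyContent;
    final String mediaType;
    final List<Header> headers;
    final List<Property> properties;

    Message(Date date, long mailboxId, long uid, Set<String> flags, byte[] fullContent,
            byte[] bodyContent, String mediaType, List<Header> headers, List<Property> properties) {
        // Defensive copies keep the message content immutable after creation.
        this.date = new Date(date.getTime());
        this.mailboxId = mailboxId;
        this.uid = uid;
        this.flags = Collections.unmodifiableSet(new HashSet<>(flags));
        this.fullContent = fullContent.clone();
        this.bodyContent = bodyContent.clone();
        this.mediaType = mediaType;
        this.headers = Collections.unmodifiableList(new ArrayList<>(headers));
        this.properties = Collections.unmodifiableList(new ArrayList<>(properties));
    }
}
```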
[jira] [Commented] (MAILBOX-72) Requirements for a distributed
mailbox implementation
Posted by "ioan eugen stan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAILBOX-72?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041169#comment-13041169 ]
ioan eugen stan commented on MAILBOX-72:
----------------------------------------
This is a summary of a discussion on the server-dev@james.apache.org mailing list.
For best results when implementing mailbox storage in a distributed environment (namely HBase/HDFS), the following entities and the operations they must support have to be taken into consideration:
- mailbox (immutable: create/read/delete/query)
- message (immutable: create/read/delete/query)
- message flags (create/read/update/delete/query)
- subscriptions (create/read/update/delete/query)
Important things regarding HBase:
- cells are versioned
- rows are sorted by row key - very important
- column families are physically stored in the same place and they should have the same access pattern (just read, or read/write)
- all column families must be created with the table
- columns may be added on the fly to column families.
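To make these points concrete, here is a small in-memory model of one table, written as nested sorted maps. It is only a sketch for reasoning about the data model and does not use the HBase client API; the class name VersionedTable and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical in-memory model of one table; not the HBase client API. */
class VersionedTable {

    // Column families are fixed when the table is created.
    private final Set<String> families;

    // row key (sorted) -> "family:qualifier" -> timestamp (newest first) -> value
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> rows =
            new TreeMap<>();

    VersionedTable(String... families) {
        this.families = new HashSet<>(Arrays.asList(families));
    }

    /** Writes one cell version; qualifiers can be new, families cannot. */
    void put(String row, String family, String qualifier, long timestamp, String value) {
        if (!families.contains(family)) {
            throw new IllegalArgumentException("Unknown column family: " + family);
        }
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(family + ":" + qualifier,
                    c -> new TreeMap<Long, String>(Comparator.<Long>reverseOrder()))
            .put(timestamp, value);
    }

    /** Returns the newest version of a cell, or null if the cell does not exist. */
    String getLatest(String row, String family, String qualifier) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(row);
        if (columns == null) {
            return null;
        }
        NavigableMap<Long, String> versions = columns.get(family + ":" + qualifier);
        return versions == null ? null : versions.firstEntry().getValue();
    }

    /** Row keys in sorted order, which is the order a scan would return them in. */
    Set<String> rowKeys() {
        return rows.keySet();
    }
}
```

In this model a message flag update is just a new version of the same cell, while an unknown column family is rejected because families cannot be added on the fly.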
Some useful tips for choosing keys and column names:
- use a reversed domain name to keep related keys together in sort order (like org.apache@username)
- use a reverse-order timestamp (Long.MAX_VALUE - epoch) so that the newest records sort first (get the latest emails first)
- if a key is an integer value, store it as binary data rather than as its string representation.
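A minimal sketch of these three tips, using only the JDK. The helper names (RowKeys, reverseDomainKey, messageRowKey) and the key layout (address prefix followed by an 8-byte reversed timestamp) are assumptions made for illustration, not a schema agreed on the list.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical row key helpers; the key layout is an assumption. */
class RowKeys {

    /** Turns "username@apache.org" into "org.apache@username". */
    static String reverseDomainKey(String address) {
        int at = address.lastIndexOf('@');
        if (at < 0) {
            throw new IllegalArgumentException("Not an address: " + address);
        }
        String[] labels = address.substring(at + 1).split("\\.");
        StringBuilder key = new StringBuilder();
        for (int i = labels.length - 1; i >= 0; i--) {
            key.append(labels[i]);
            if (i > 0) {
                key.append('.');
            }
        }
        return key.append('@').append(address.substring(0, at)).toString();
    }

    /** Larger (newer) epoch values give smaller results, so they sort first. */
    static long reverseTimestamp(long epochMillis) {
        return Long.MAX_VALUE - epochMillis;
    }

    /** Key = UTF-8 reversed address followed by the reversed timestamp as 8 big-endian bytes. */
    static byte[] messageRowKey(String address, long epochMillis) {
        byte[] prefix = reverseDomainKey(address).getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(prefix.length + Long.BYTES)
                .put(prefix)
                .putLong(reverseTimestamp(epochMillis))
                .array();
    }

    /** Unsigned lexicographic comparison, the order a byte-sorted store would apply. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }
}
```

The binary encoding matters for ordering: a non-negative long written as 8 big-endian bytes sorts in numeric order under byte comparison, whereas the strings "10" and "9" sort the wrong way round.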
A paper detailing a sample data schema for Cassandra is available in [1].
Some reading on big data column stores:
[1] http://ewh.ieee.org/r6/scv/computer/nfic/2009/IBM-Jun-Rao.pdf
[2] Hadoop: The Definitive Guide, second edition, the HBase chapter.
[3] http://db.csail.mit.edu/projects/cstore/abadicidr07.pdf
[4] http://en.wikipedia.org/wiki/Column-oriented_DBMS