Posted to server-dev@james.apache.org by "Eric Charles (JIRA)" <ji...@apache.org> on 2011/05/24 15:51:47 UTC

[jira] [Created] (MAILBOX-72) Requirements for a distributed mailbox implementation

Requirements for a distributed mailbox implementation
-----------------------------------------------------

                 Key: MAILBOX-72
                 URL: https://issues.apache.org/jira/browse/MAILBOX-72
             Project: James Mailbox
          Issue Type: New Feature
            Reporter: Eric Charles
            Assignee: Norman Maurer


This JIRA will collect generic technical requirements for a distributed mailbox implementation, whatever the implementation technology is.
The implementer is responsible for enforcing those requirements.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
For additional commands, e-mail: server-dev-help@james.apache.org


[jira] [Commented] (MAILBOX-72) Requirements for a distributed mailbox implementation

Posted by "Eric Charles (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAILBOX-72?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049883#comment-13049883 ] 

Eric Charles commented on MAILBOX-72:
-------------------------------------

To design a data model, it is important to know which queries we will have to support.
Below are the queries from mailbox-jpa (the SQL database implementation).

From AbstractJPAMessage:
@NamedQueries({
    @NamedQuery(name="findRecentMessagesInMailbox",
            query="SELECT message FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.recent = TRUE"),
    @NamedQuery(name="findUnseenMessagesInMailboxOrderByUid",
            query="SELECT message FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.seen = FALSE ORDER BY message.uid ASC"),
    @NamedQuery(name="findMessagesInMailbox",
            query="SELECT message FROM Message message WHERE message.mailbox.mailboxId = :idParam"),
    @NamedQuery(name="findMessagesInMailboxBetweenUIDs",
            query="SELECT message FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid BETWEEN :fromParam AND :toParam"),        
    @NamedQuery(name="findMessagesInMailboxWithUID",
            query="SELECT message FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid=:uidParam"),                    
    @NamedQuery(name="findMessagesInMailboxAfterUID",
            query="SELECT message FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid>=:uidParam"),                    
    @NamedQuery(name="findDeletedMessagesInMailbox",
            query="SELECT message.uid FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.deleted=TRUE"),        
    @NamedQuery(name="findDeletedMessagesInMailboxBetweenUIDs",
            query="SELECT message.uid FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid BETWEEN :fromParam AND :toParam AND message.deleted=TRUE"),        
    @NamedQuery(name="findDeletedMessagesInMailboxWithUID",
            query="SELECT message.uid FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid=:uidParam AND message.deleted=TRUE"),                    
    @NamedQuery(name="findDeletedMessagesInMailboxAfterUID",
            query="SELECT message.uid FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid>=:uidParam AND message.deleted=TRUE"),          
            
    @NamedQuery(name="deleteDeletedMessagesInMailbox",
            query="DELETE FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.deleted=TRUE"),        
    @NamedQuery(name="deleteDeletedMessagesInMailboxBetweenUIDs",
            query="DELETE FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid BETWEEN :fromParam AND :toParam AND message.deleted=TRUE"),        
    @NamedQuery(name="deleteDeletedMessagesInMailboxWithUID",
            query="DELETE FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid=:uidParam AND message.deleted=TRUE"),                    
    @NamedQuery(name="deleteDeletedMessagesInMailboxAfterUID",
            query="DELETE FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid>=:uidParam AND message.deleted=TRUE"),  
                    
    @NamedQuery(name="countUnseenMessagesInMailbox",
            query="SELECT COUNT(message) FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.seen=FALSE"),                     
    @NamedQuery(name="countMessagesInMailbox",
            query="SELECT COUNT(message) FROM Message message WHERE message.mailbox.mailboxId = :idParam"),                    
    @NamedQuery(name="deleteMessages",
            query="DELETE FROM Message message WHERE message.mailbox.mailboxId = :idParam"),
    @NamedQuery(name="findLastUidInMailbox",
            query="SELECT message.uid FROM Message message WHERE message.mailbox.mailboxId = :idParam ORDER BY message.uid DESC"),
    @NamedQuery(name="deleteAllMemberships",
            query="DELETE FROM Message message")
})

From JPAMailbox:
    @NamedQuery(name="findMailboxById",
        query="SELECT mailbox FROM Mailbox mailbox WHERE mailbox.mailboxId = :idParam"),
    @NamedQuery(name="findMailboxByName",
        query="SELECT mailbox FROM Mailbox mailbox WHERE mailbox.name = :nameParam and mailbox.user is NULL and mailbox.namespace= :namespaceParam"),
    @NamedQuery(name="findMailboxByNameWithUser",
        query="SELECT mailbox FROM Mailbox mailbox WHERE mailbox.name = :nameParam and mailbox.user= :userParam and mailbox.namespace= :namespaceParam"),
    @NamedQuery(name="deleteAllMailboxes",
        query="DELETE FROM Mailbox mailbox"),
    @NamedQuery(name="findMailboxWithNameLikeWithUser",
        query="SELECT mailbox FROM Mailbox mailbox WHERE mailbox.name LIKE :nameParam and mailbox.user= :userParam and mailbox.namespace= :namespaceParam"),
    @NamedQuery(name="findMailboxWithNameLike",
        query="SELECT mailbox FROM Mailbox mailbox WHERE mailbox.name LIKE :nameParam and mailbox.user is NULL and mailbox.namespace= :namespaceParam"),
    @NamedQuery(name="countMailboxesWithNameLikeWithUser",
        query="SELECT COUNT(mailbox) FROM Mailbox mailbox WHERE mailbox.name LIKE :nameParam and mailbox.user= :userParam and mailbox.namespace= :namespaceParam"),
    @NamedQuery(name="countMailboxesWithNameLike",
        query="SELECT COUNT(mailbox) FROM Mailbox mailbox WHERE mailbox.name LIKE :nameParam and mailbox.user is NULL and mailbox.namespace= :namespaceParam"),
    @NamedQuery(name="listMailboxes",
        query="SELECT mailbox FROM Mailbox mailbox")

From JPASubscription:
@NamedQueries({
    @NamedQuery(name = "findFindMailboxSubscriptionForUser",
        query = "SELECT subscription FROM Subscription subscription WHERE subscription.username = :userParam AND subscription.mailbox = :mailboxParam"),          
    @NamedQuery(name = "findSubscriptionsForUser",
        query = "SELECT subscription FROM Subscription subscription WHERE subscription.username = :userParam")                  
})




[jira] [Updated] (MAILBOX-72) Requirements for a distributed mailbox implementation

Posted by "Eric Charles (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAILBOX-72?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Charles updated MAILBOX-72:
--------------------------------

    Attachment: Datamodel-mailbox-0.2.png

The attached image shows the domain classes that the store must implement:
- Mailbox (with id, namespace, user, name, uidValidity)
- Subscription (with mailbox, user)
- Message (with date, mailboxid, uid, flags, fullcontent, bodycontent, mediatype, headers, properties,...)
- Header (with fieldname, linenumber, value)
- Property (with namespace, localname, value)
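
To make the list above more concrete, here is a minimal Java sketch of those domain classes. Only the field names come from this issue; every field type, the constructor shapes, and the use of plain immutable classes are assumptions made for illustration, not the actual James API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// All field types below are assumptions; the issue only lists field names.

final class Mailbox {
    final long id;
    final String namespace;
    final String user;       // null when the mailbox has no owning user
    final String name;
    final long uidValidity;

    Mailbox(long id, String namespace, String user, String name, long uidValidity) {
        this.id = id;
        this.namespace = namespace;
        this.user = user;
        this.name = name;
        this.uidValidity = uidValidity;
    }
}

final class Subscription {
    final String mailbox;
    final String user;

    Subscription(String mailbox, String user) {
        this.mailbox = mailbox;
        this.user = user;
    }
}

final class Header {
    final String fieldName;
    final int lineNumber;
    final String value;

    Header(String fieldName, int lineNumber, String value) {
        this.fieldName = fieldName;
        this.lineNumber = lineNumber;
        this.value = value;
    }
}

final class Property {
    final String namespace;
    final String localName;
    final String value;

    Property(String namespace, String localName, String value) {
        this.namespace = namespace;
        this.localName = localName;
        this.value = value;
    }
}

final class Message {
    final Date date;
    final long mailboxId;
    final long uid;
    final Set<String> flags;          // e.g. "seen", "recent", "deleted"
    final byte[] fullContent;
    final byte[] bodyContent;
    final String mediaType;
    final List<Header> headers;
    final List<Property> properties;

    Message(Date date, long mailboxId, long uid, Set<String> flags,
            byte[] fullContent, byte[] bodyContent, String mediaType,
            List<Header> headers, List<Property> properties) {
        this.date = date;
        this.mailboxId = mailboxId;
        this.uid = uid;
        // Defensive copies so callers cannot change the message afterwards.
        this.flags = Collections.unmodifiableSet(new HashSet<>(flags));
        this.fullContent = fullContent.clone();
        this.bodyContent = bodyContent.clone();
        this.mediaType = mediaType;
        this.headers = Collections.unmodifiableList(new ArrayList<>(headers));
        this.properties = Collections.unmodifiableList(new ArrayList<>(properties));
    }
}
```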



[jira] [Commented] (MAILBOX-72) Requirements for a distributed mailbox implementation

Posted by "ioan eugen stan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAILBOX-72?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041169#comment-13041169 ] 

ioan eugen stan commented on MAILBOX-72:
----------------------------------------

This is a summary of a discussion available on the server-dev@james.apache.org mailing list.

For best results when implementing mailbox storage over a distributed environment (namely HBase/HDFS), the following things must be taken into consideration:

- mailbox (immutable: create/read/delete/query)
- message (immutable: create/read/delete/query)

- message flags (create/read/update/delete/query)
- subscriptions (create/read/update/delete/query)
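
One way to read the split above is that write-once message content and frequently updated flags can live in separate structures. The following is only an in-memory sketch of that idea; the class name SplitMessageStore, the Flag enum, and all method names are hypothetical and are not part of James or HBase.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Flag names mirror the seen/recent/deleted attributes used by the JPA queries.
enum Flag { SEEN, RECENT, DELETED }

/** Hypothetical store: content is create/read/delete only, flags can also be updated. */
final class SplitMessageStore {
    private final Map<Long, String> content = new HashMap<>();
    private final Map<Long, EnumSet<Flag>> flags = new HashMap<>();

    /** Creates a message; an existing uid is never overwritten. */
    void create(long uid, String body) {
        if (content.putIfAbsent(uid, body) != null) {
            throw new IllegalStateException("message " + uid + " already exists");
        }
        flags.put(uid, EnumSet.noneOf(Flag.class));
    }

    /** Returns the stored content, or null if the uid is unknown. */
    String read(long uid) {
        return content.get(uid);
    }

    /** Flags are the only part of a message that changes after creation. */
    void setFlag(long uid, Flag flag, boolean on) {
        EnumSet<Flag> set = flags.get(uid);
        if (set == null) {
            throw new IllegalArgumentException("no such message: " + uid);
        }
        if (on) {
            set.add(flag);
        } else {
            set.remove(flag);
        }
    }

    /** Returns a copy of the flags, empty if the uid is unknown. */
    Set<Flag> flagsOf(long uid) {
        EnumSet<Flag> set = flags.get(uid);
        return set == null ? EnumSet.noneOf(Flag.class) : EnumSet.copyOf(set);
    }

    void delete(long uid) {
        content.remove(uid);
        flags.remove(uid);
    }
}
```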

Important things regarding HBase:

- cells are versioned
- rows are sorted by row key - very important
- the columns of a column family are physically stored together, so they should share the same access pattern (read-only, or read/write)
- all column families must be created with the table
- columns may be added on the fly to column families.
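
To illustrate why sorted row keys matter, here is a small in-memory stand-in that uses a TreeMap in place of a real table; it does not use the HBase API. The class name SortedMessageTable, the "mailboxId:uid" key layout, and the zero-padded string keys are assumptions chosen for readability. With such a layout, a query like findMessagesInMailboxBetweenUIDs becomes a single contiguous range scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.NavigableMap;
import java.util.TreeMap;

/** In-memory stand-in for a table whose rows are kept sorted by row key. */
final class SortedMessageTable {
    private final NavigableMap<String, String> rows = new TreeMap<>();

    /**
     * Builds a "mailboxId:uid" key. Both parts are zero-padded to 19 digits so
     * that string order equals numeric order for non-negative values.
     */
    static String rowKey(long mailboxId, long uid) {
        return String.format(Locale.ROOT, "%019d:%019d", mailboxId, uid);
    }

    void put(long mailboxId, long uid, String subject) {
        rows.put(rowKey(mailboxId, uid), subject);
    }

    /** Counterpart of findMessagesInMailboxBetweenUIDs: one range scan, bounds inclusive. */
    List<String> betweenUids(long mailboxId, long fromUid, long toUid) {
        return new ArrayList<>(
                rows.subMap(rowKey(mailboxId, fromUid), true,
                            rowKey(mailboxId, toUid), true).values());
    }

    /** Counterpart of findMessagesInMailbox: scan all rows sharing the mailbox prefix. */
    List<String> allInMailbox(long mailboxId) {
        return betweenUids(mailboxId, 0L, Long.MAX_VALUE);
    }
}
```

Rows inserted in any order come back ordered by mailbox and then by uid, so no secondary index is needed for the UID-based queries.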

Some useful tips for choosing keys and column names:

- you can use reverse domain name to keep things in a proper sorted order (like org.apache@username)
- you can use reverse order time stamp (Long.MAX_VALUE - epoch) to keep the newest records first (get the latest emails first).
- use the binary representation instead of the string representation if the key is an integer value.
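
The three tips above can be sketched with plain JDK classes. The helper class RowKeys and its method names are hypothetical; only the ideas (reverse domain name, Long.MAX_VALUE - epoch, binary instead of string keys) come from the discussion.

```java
import java.nio.ByteBuffer;

/** Hypothetical helpers; none of these names come from James or HBase. */
final class RowKeys {
    private RowKeys() {}

    /** Turns "username@apache.org" into "org.apache@username". */
    static String reverseDomainUser(String address) {
        int at = address.lastIndexOf('@');
        if (at < 0) {
            throw new IllegalArgumentException("not an address: " + address);
        }
        String user = address.substring(0, at);
        String[] labels = address.substring(at + 1).split("\\.");
        StringBuilder sb = new StringBuilder();
        for (int i = labels.length - 1; i >= 0; i--) {
            sb.append(labels[i]);
            if (i > 0) {
                sb.append('.');
            }
        }
        return sb.append('@').append(user).toString();
    }

    /** Newer timestamps yield smaller values, so they sort first in ascending order. */
    static long reverseTimestamp(long epochMillis) {
        return Long.MAX_VALUE - epochMillis;
    }

    /** Fixed-width big-endian encoding of a long (8 bytes). */
    static byte[] toBytes(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    /** Unsigned lexicographic byte comparison, the usual order for byte[] keys. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }
}
```

The binary tip matters because the strings "9" and "10" sort in the wrong order ("10" before "9"), while the 8-byte encodings of non-negative values sort numerically.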

A paper detailing a sample data schema for Cassandra is available in [1]. 

Some reading about big-data column stores:

[1] http://ewh.ieee.org/r6/scv/computer/nfic/2009/IBM-Jun-Rao.pdf
[2] Hadoop: The Definitive Guide, second edition, the HBase chapter.
[3] http://db.csail.mit.edu/projects/cstore/abadicidr07.pdf
[4] http://en.wikipedia.org/wiki/Column-oriented_DBMS

