Posted to server-dev@james.apache.org by "Norman Maurer (JIRA)" <ji...@apache.org> on 2011/06/15 20:40:48 UTC
[jira] [Issue Comment Edited] (MAILBOX-44) [gsoc2011] Design and
implement a distributed mailbox using Hadoop
[ https://issues.apache.org/jira/browse/MAILBOX-44?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049949#comment-13049949 ]
Norman Maurer edited comment on MAILBOX-44 at 6/15/11 6:40 PM:
---------------------------------------------------------------
@Stack:
I hope this makes it clearer:
messagesMetaData(CF): {
    mailboxId/uid: {
        uid: 1,
        mailboxId: 184e-ske1-igk2-gj71,
        flags.recent: true,
        flags.deleted: true,
        flags.seen: true,
        flags.deleted: false,
        flags.seen: false,
        flags.flagged: true,
        bodyOctets: 19484,
        fullContentOctets: 10304,
        properties: namespace::localname::value;;namespace2::localname2::value2,
        headers: byte[],
        mediaType: text,
        subType: plain,
        textualLineCount: 24
    }
}
messagesContent(CF): {
    mailboxId/uid: {
        1: byte[],
        2: byte[],
        3: byte[]
    }
}
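As an illustration of the "properties" column above, here is a minimal Java sketch of how such a value could be packed and unpacked. The class and method names are hypothetical, and it assumes that neither "::" nor ";;" occurs inside a namespace or local name and that ";;" never occurs inside a value; the layout above does not say how such cases would be escaped.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for the "properties" column; not part of the James code base. */
final class PropertiesColumnSketch {

    /** Splits "ns::local::value;;ns2::local2::value2" into {namespace, localName, value} triples. */
    static List<String[]> decode(String column) {
        List<String[]> out = new ArrayList<>();
        if (column == null || column.isEmpty()) {
            return out;
        }
        for (String entry : column.split(";;")) {
            // Limit 3 keeps any further "::" inside the value part.
            String[] parts = entry.split("::", 3);
            if (parts.length != 3) {
                throw new IllegalArgumentException("Malformed property: " + entry);
            }
            out.add(parts);
        }
        return out;
    }

    /** Joins triples back into the single-column representation. */
    static String encode(List<String[]> props) {
        StringBuilder sb = new StringBuilder();
        for (String[] p : props) {
            if (sb.length() > 0) {
                sb.append(";;");
            }
            sb.append(p[0]).append("::").append(p[1]).append("::").append(p[2]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String column = "namespace::localname::value;;namespace2::localname2::value2";
        for (String[] p : decode(column)) {
            System.out.println(p[0] + " / " + p[1] + " / " + p[2]);
        }
    }
}
```

Keeping all properties in one column means a single read per message; the trade-off is that individual properties cannot be indexed or updated separately.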
I then have secondary indexes on the messagesMetaData CF so that I can fetch all messages that belong to mailbox X and have the deleted flag set, etc.
I used RP and relied on the secondary indexes to "filter" for the right messages.
Does that explain it a bit more?
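To make the lookup idea concrete, here is a rough in-memory Java sketch of rows keyed by mailboxId/uid with a secondary index per column value. It is only a model of the idea: it does not use any real datastore client API, and the class name, method names and the "column=value" index key format are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** In-memory stand-in for the messagesMetaData CF plus its secondary indexes (hypothetical). */
final class MetaDataIndexSketch {
    // Row key "mailboxId/uid" -> column name -> value.
    private final Map<String, Map<String, String>> rows = new HashMap<>();
    // Secondary index: "column=value" -> row keys carrying that column value.
    private final Map<String, Set<String>> index = new HashMap<>();

    static String rowKey(String mailboxId, long uid) {
        return mailboxId + "/" + uid;
    }

    void put(String mailboxId, long uid, Map<String, String> columns) {
        String key = rowKey(mailboxId, uid);
        Map<String, String> old = rows.remove(key);
        if (old != null) {
            // Drop stale index entries before re-indexing an overwritten row.
            for (Map.Entry<String, String> e : old.entrySet()) {
                Set<String> keys = index.get(e.getKey() + "=" + e.getValue());
                if (keys != null) {
                    keys.remove(key);
                }
            }
        }
        Map<String, String> row = new HashMap<>(columns);
        row.put("mailboxId", mailboxId);
        row.put("uid", Long.toString(uid));
        rows.put(key, row);
        for (Map.Entry<String, String> e : row.entrySet()) {
            index.computeIfAbsent(e.getKey() + "=" + e.getValue(), k -> new TreeSet<>()).add(key);
        }
    }

    /** Row keys in the given mailbox whose column has the given value, via index intersection. */
    List<String> find(String mailboxId, String column, String value) {
        Set<String> result = new TreeSet<>(
                index.getOrDefault("mailboxId=" + mailboxId, Collections.emptySet()));
        result.retainAll(index.getOrDefault(column + "=" + value, Collections.emptySet()));
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        MetaDataIndexSketch cf = new MetaDataIndexSketch();
        String mailbox = "184e-ske1-igk2-gj71";
        cf.put(mailbox, 1, Collections.singletonMap("flags.deleted", "true"));
        cf.put(mailbox, 2, Collections.singletonMap("flags.deleted", "false"));
        cf.put("other-mailbox", 3, Collections.singletonMap("flags.deleted", "true"));
        // Prints [184e-ske1-igk2-gj71/1]
        System.out.println(cf.find(mailbox, "flags.deleted", "true"));
    }
}
```

The point of the sketch is that the query never scans rows in key order: it intersects the index entry for the mailbox with the index entry for the flag, which is why the row-key ordering does not matter for this kind of lookup.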
> [gsoc2011] Design and implement a distributed mailbox using Hadoop
> ------------------------------------------------------------------
>
> Key: MAILBOX-44
> URL: https://issues.apache.org/jira/browse/MAILBOX-44
> Project: James Mailbox
> Issue Type: New Feature
> Reporter: Eric Charles
> Assignee: Norman Maurer
> Labels: gsoc2011
> Fix For: 0.3
>
>
> Context: The mailbox subproject (http://james.apache.org/mailbox/) supports maildir, SQL databases (via JPA) and Java Content Repository (JCR) as technologies for mail storage. This flexibility is achieved thanks to an API design that abstracts mail storage from the mail protocols.
> Task: We need to implement mailbox storage as a distributed system on top of Hadoop HDFS. The James mailbox API will be used. The first step is to design how to interact with Hadoop (native API, Gora in the Apache Incubator, ...) and to deal with specific performance questions related to mail loading/parsing in a distributed system (use map/reduce or not, use existing local Lucene indexes for search, ...). The second step is to implement the HDFS mailbox (the maildir mailbox is similar because it stores mails as files and can be an inspiration). A single James server will still be deployed because we don't have any distributed UID generation.
> Mentor: eric at apache dot org
> Complexity: medium
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira