Posted to server-dev@james.apache.org by "Benoit Tellier (Jira)" <se...@james.apache.org> on 2020/06/18 04:31:00 UTC

[jira] [Closed] (JAMES-3148) Cassandra mailbox deletion cleanup

     [ https://issues.apache.org/jira/browse/JAMES-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benoit Tellier closed JAMES-3148.
---------------------------------
    Resolution: Fixed

Merged

> Cassandra mailbox deletion cleanup
> ----------------------------------
>
>                 Key: JAMES-3148
>                 URL: https://issues.apache.org/jira/browse/JAMES-3148
>             Project: James Server
>          Issue Type: New Feature
>          Components: cassandra, mailbox
>    Affects Versions: 3.5.0
>            Reporter: Benoit Tellier
>            Priority: Major
>             Fix For: 3.6.0
>
>
> Cassandra is used within the distributed James product to hold message and mailbox metadata.
> Cassandra holds the following tables:
>  - mailboxPathV2 + mailbox allow retrieving mailbox information
>  - acl + UserMailboxACL hold denormalized information
>  - messageIdTable & imapUidTable allow retrieving mailbox context information
>  - the messageV2 table holds message metadata
>  - attachmentV2 holds attachments for messages
>  - References to these attachments are contained within the attachmentOwner and attachmentMessageId tables
>  
> Currently, deletion only removes the first level of metadata. Lower-level metadata becomes unreachable but is not 
> deleted: the data looks deleted, but references to it are actually still present.
> Concretely:
>  - Upon mailbox deletion, only mailboxPathV2 & mailbox content is deleted. messageIdTable, imapUidTable, messageV2, 
>  attachmentV2 & attachmentMessageId metadata is left undeleted.
>  - Upon mailbox deletion, acl + UserMailboxACL is not deleted.
>  - Upon message deletion, only messageIdTable & imapUidTable content is deleted. messageV2, attachmentV2 & 
>  attachmentMessageId metadata is left undeleted.
> This jeopardizes efforts to regain disk space and privacy, for example through blobStore garbage collection.
> We need to clean up Cassandra metadata. It can be retrieved from dangling metadata after the delete operation has been 
> carried out. We need to delete the lower levels first so that, upon failure, undeleted metadata can still be reached.
> This cleanup is not needed for strict correctness from a MailboxManager point of view, thus it could be carried out 
> asynchronously via mailbox listeners so that it can be retried.
> Mailbox listener failures lead to the eventBus retrying their execution, so we need to ensure that the deletion is 
> idempotent. This might have consequences for the blobStore garbage collection design.
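
The ordering and idempotency requirements described above can be illustrated with a minimal sketch. It is not James code: the class, its fields and its methods are hypothetical, and plain in-memory maps stand in for the Cassandra tables. It also simplifies the real situation by keeping the upper-level references in place until the lower levels are gone. It only shows why deleting lower levels first keeps undeleted metadata reachable after a failure, and why a retried cleanup is harmless.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory stand-in for the Cassandra tables; an assumption, not James code. */
class MailboxCleanupSketch {
    // mailboxId -> messageIds (loosely stands in for messageIdTable / imapUidTable)
    final Map<String, Set<String>> mailboxToMessages = new HashMap<>();
    // messageId -> attachmentIds (loosely stands in for messageV2)
    final Map<String, Set<String>> messageToAttachments = new HashMap<>();
    // attachmentIds (loosely stands in for attachmentV2)
    final Set<String> attachments = new HashSet<>();
    // When non-null, deleting this attachment fails exactly once (simulated outage)
    String failOnceOnAttachment;

    void deleteAttachment(String attachmentId) {
        if (attachmentId.equals(failOnceOnAttachment)) {
            failOnceOnAttachment = null;
            throw new IllegalStateException("simulated failure on " + attachmentId);
        }
        attachments.remove(attachmentId); // removing an absent id is a no-op
    }

    void cleanupMessage(String messageId) {
        Set<String> attachmentIds =
            messageToAttachments.getOrDefault(messageId, Collections.emptySet());
        // Lowest level first: attachments go before the row that references them.
        for (String attachmentId : attachmentIds) {
            deleteAttachment(attachmentId);
        }
        // Only reached when every attachment deletion succeeded.
        messageToAttachments.remove(messageId);
    }

    void cleanupMailbox(String mailboxId) {
        Set<String> messageIds =
            mailboxToMessages.getOrDefault(mailboxId, Collections.emptySet());
        for (String messageId : messageIds) {
            cleanupMessage(messageId);
        }
        // The mailbox-level references are dropped last.
        mailboxToMessages.remove(mailboxId);
    }

    public static void main(String[] args) {
        MailboxCleanupSketch tables = new MailboxCleanupSketch();
        tables.mailboxToMessages.put("mailbox-1", new HashSet<>(Arrays.asList("msg-1")));
        tables.messageToAttachments.put("msg-1", new HashSet<>(Arrays.asList("att-1", "att-2")));
        tables.attachments.addAll(Arrays.asList("att-1", "att-2"));
        tables.failOnceOnAttachment = "att-2";

        try {
            tables.cleanupMailbox("mailbox-1");
        } catch (IllegalStateException e) {
            // The failure happened below the reference rows, so they are still there.
            System.out.println("first attempt failed, references kept: "
                + tables.mailboxToMessages.containsKey("mailbox-1"));
        }
        tables.cleanupMailbox("mailbox-1"); // retry finishes the work
        tables.cleanupMailbox("mailbox-1"); // redelivered event is a no-op
        System.out.println("attachments left: " + tables.attachments.size());
    }
}
```

In this sketch a retry can always find the remaining work because the references are removed only after everything beneath them is gone, and running the cleanup again on already-deleted data changes nothing.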



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
For additional commands, e-mail: server-dev-help@james.apache.org