You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@james.apache.org by GitBox <gi...@apache.org> on 2021/07/21 08:28:20 UTC

[GitHub] [james-project] chibenwa opened a new pull request #544: JAMES-3544 [ADR] Privacy: deletion of JMAP uploads

chibenwa opened a new pull request #544:
URL: https://github.com/apache/james-project/pull/544


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org
For additional commands, e-mail: notifications-help@james.apache.org


[GitHub] [james-project] chibenwa merged pull request #544: JAMES-3544 [ADR] Privacy: deletion of JMAP uploads

Posted by GitBox <gi...@apache.org>.
chibenwa merged pull request #544:
URL: https://github.com/apache/james-project/pull/544


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org
For additional commands, e-mail: notifications-help@james.apache.org


[GitHub] [james-project] Arsnael commented on a change in pull request #544: JAMES-3544 [ADR] Privacy: deletion of JMAP uploads

Posted by GitBox <gi...@apache.org>.
Arsnael commented on a change in pull request #544:
URL: https://github.com/apache/james-project/pull/544#discussion_r673782765



##########
File path: src/adr/0048-cleanup-jmap-uploads.md
##########
@@ -0,0 +1,85 @@
+# 48. Cleanup of JMAP uploads
+
+Date: 2021-07-21
+
+## Status
+
+Accepted (lazy consensus).
+
+Not yet implemented.
+
+## Context
+
+JMAP allow users to upload binary content called blobs to be later referenced via method calls. This includes but is not

Review comment:
       ```suggestion
   JMAP allows users to upload binary content called blobs to be later referenced via method calls. This includes but is not
   ```

##########
File path: src/adr/0048-cleanup-jmap-uploads.md
##########
@@ -0,0 +1,85 @@
+# 48. Cleanup of JMAP uploads
+
+Date: 2021-07-21
+
+## Status
+
+Accepted (lazy consensus).
+
+Not yet implemented.
+
+## Context
+
+JMAP allow users to upload binary content called blobs to be later referenced via method calls. This includes but is not
+limited to `Email/set` for specifying the blobId of attachments and `Email/import`.
+
+The [specification](https://jmap.io/spec-core.html#binary-data) strongly encourages enforcing the cleanup of these uploads:
+
+```
+A blob that is not referenced by a JMAP object (e.g., as a message attachment) MAY be deleted by the server to free up 
+resources. Uploads (see below) are initially unreferenced blobs.
+
+[...] An unreferenced blob MUST NOT be deleted for at least 1 hour from the time of upload; if reuploaded, the same 
+blobId MAY be returned, but this SHOULD reset the expiry time.
+```
+
+Deleting such uploads in a timely manner is important as:
+
+ - It enables freeing server resources.
+ - failing to do so may compromise privacy: content the user have uploaded and long forgotten might still be accessible
+ in the underlying data-store. Failing to delete uploads in a timely fashion may jeopardize for instance GDPR compliance.
+ 
+Today, uploads are stored along side email attachments. This means:
+ - We can hardly apply a specific lifecycle that cleans up uploads, as distinguishing attachment from uploads is not 
+ trivial.
+ - We currently have a complex right resolution system on attachment, handling both the upload case (were the attachment
+ is linked to a user) and the 'true' attachment case (linked to a message, those who can access the message can access 
+ the attachment). This leads to sub-optimal code (slow).
+
+## Decision
+
+We need to create a separate interface `UploadRepository` in `data-jmap` to store uploads for each user. We would provide a memory 
+implementation as well as a distributed implementation of it.
+
+The distributed implementation would host metadata of the upload in Cassandra, and the content using the BlobStore API,
+so object storage.
+
+This `UploadRepository` would be used by JMAP RFC-8620 to back uploads (instead of the attachment manager), we will 
+provide a `BlobResolver` to enable interactions with the uploaded blob. Similarly, we will use the `UploadRepository` to
+back uploads of JMAP draft.
+
+We will implement cleanup of the distributed `UploadRepository`. This will be done via:
+ - TTLs on the Cassandra metadata.
+ - Organisation of the blobs in time ranged buckets, only the two most recent buckets are kept.
+ - A WebAdmin endpoint would allow to plan a CRON triggering the cleanup.
+
+## Consequences
+
+Upon migrating to the `UploadRepository`, previous uploads will not be carried other. No migration plan is provided as 
+the impact is minimal. Upload prior this change will never be cleaned up. This is acceptable as JMAP implementation are 

Review comment:
       ```suggestion
   the impact is minimal. Upload prior this change will never be cleaned up. This is acceptable as JMAP implementations are
   ```

##########
File path: src/adr/0048-cleanup-jmap-uploads.md
##########
@@ -0,0 +1,85 @@
+# 48. Cleanup of JMAP uploads
+
+Date: 2021-07-21
+
+## Status
+
+Accepted (lazy consensus).
+
+Not yet implemented.
+
+## Context
+
+JMAP allow users to upload binary content called blobs to be later referenced via method calls. This includes but is not
+limited to `Email/set` for specifying the blobId of attachments and `Email/import`.
+
+The [specification](https://jmap.io/spec-core.html#binary-data) strongly encourages enforcing the cleanup of these uploads:
+
+```
+A blob that is not referenced by a JMAP object (e.g., as a message attachment) MAY be deleted by the server to free up 
+resources. Uploads (see below) are initially unreferenced blobs.
+
+[...] An unreferenced blob MUST NOT be deleted for at least 1 hour from the time of upload; if reuploaded, the same 
+blobId MAY be returned, but this SHOULD reset the expiry time.
+```
+
+Deleting such uploads in a timely manner is important as:
+
+ - It enables freeing server resources.
+ - failing to do so may compromise privacy: content the user have uploaded and long forgotten might still be accessible
+ in the underlying data-store. Failing to delete uploads in a timely fashion may jeopardize for instance GDPR compliance.
+ 
+Today, uploads are stored along side email attachments. This means:
+ - We can hardly apply a specific lifecycle that cleans up uploads, as distinguishing attachment from uploads is not 
+ trivial.
+ - We currently have a complex right resolution system on attachment, handling both the upload case (were the attachment
+ is linked to a user) and the 'true' attachment case (linked to a message, those who can access the message can access 
+ the attachment). This leads to sub-optimal code (slow).
+
+## Decision
+
+We need to create a separate interface `UploadRepository` in `data-jmap` to store uploads for each user. We would provide a memory 
+implementation as well as a distributed implementation of it.
+
+The distributed implementation would host metadata of the upload in Cassandra, and the content using the BlobStore API,
+so object storage.
+
+This `UploadRepository` would be used by JMAP RFC-8620 to back uploads (instead of the attachment manager), we will 
+provide a `BlobResolver` to enable interactions with the uploaded blob. Similarly, we will use the `UploadRepository` to
+back uploads of JMAP draft.
+
+We will implement cleanup of the distributed `UploadRepository`. This will be done via:
+ - TTLs on the Cassandra metadata.
+ - Organisation of the blobs in time ranged buckets, only the two most recent buckets are kept.
+ - A WebAdmin endpoint would allow to plan a CRON triggering the cleanup.
+
+## Consequences
+
+Upon migrating to the `UploadRepository`, previous uploads will not be carried other. No migration plan is provided as 

Review comment:
       ```suggestion
   Upon migrating to the `UploadRepository`, previous uploads will not be carried over. No migration plan is provided as 
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org
For additional commands, e-mail: notifications-help@james.apache.org