
Posted to server-dev@james.apache.org by "Tellier Benoit (JIRA)" <se...@james.apache.org> on 2019/08/02 14:46:00 UTC

[jira] [Created] (JAMES-2852) Optimizing CassandraBlobStore deleteBucket

Tellier Benoit created JAMES-2852:
-------------------------------------

             Summary: Optimizing CassandraBlobStore deleteBucket
                 Key: JAMES-2852
                 URL: https://issues.apache.org/jira/browse/JAMES-2852
             Project: James Server
          Issue Type: Improvement
          Components: Blob, cassandra
            Reporter: Tellier Benoit


Currently, CassandraBlobStore needs to iterate over all blobs stored outside the default bucket in order to delete a bucket.

These were our design considerations:

We wanted to avoid the "wide row" issue (with many blobs stored in the same bucket, the maximum size of a cell would have been exceeded) and to optimize data distribution across the cluster. For these reasons, we had to choose a primary key with a finer granularity than just the bucket: we chose to rely on both the bucket and the object identifier. This makes deleting a bucket slow, as all blobs not in the default bucket need to be iterated on.
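
To illustrate the cost, here is a minimal in-memory sketch, not the actual CassandraBlobStore code. It assumes a flat store keyed by (bucket, blob id), standing in for the table that holds blobs outside the default bucket. The class name, method names and the "visited" counter are hypothetical and only serve the illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: every blob is addressed by the pair (bucket, blobId),
// so nothing groups the blobs of one bucket together.
public class BucketScanSketch {
    private final Map<Map.Entry<String, String>, byte[]> blobs = new HashMap<>();
    // Number of keys deleteBucket had to look at, to make the scan cost visible.
    private int visited = 0;

    public void save(String bucket, String blobId, byte[] data) {
        blobs.put(Map.entry(bucket, blobId), data);
    }

    // Deleting a bucket requires visiting every stored key,
    // including keys that belong to other buckets.
    public int deleteBucket(String bucket) {
        List<Map.Entry<String, String>> toDelete = new ArrayList<>();
        for (Map.Entry<String, String> key : blobs.keySet()) {
            visited++;
            if (key.getKey().equals(bucket)) {
                toDelete.add(key);
            }
        }
        toDelete.forEach(blobs::remove);
        return toDelete.size();
    }

    public int visited() {
        return visited;
    }

    public int size() {
        return blobs.size();
    }
}
```

With two blobs in the bucket being deleted and one blob in another bucket, deleteBucket still visits all three keys: the work grows with the total number of blobs, not with the size of the bucket.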

The only usage so far is the vault, which currently relies on 13 buckets, hence the overhead introduced is reasonable.

However, this cost will increase as we expand our usage of buckets.

Later on, we could introduce a time series to easily retrieve the blobs stored in a bucket and avoid iterating over unrelated blobs.
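
One possible reading of this idea, sketched in memory under loud assumptions: alongside the blobs, keep an index from bucket to time slice to blob ids, so that deleteBucket only touches the blobs of that bucket. The slicing by epoch day, the class name and the method names are all hypothetical; the issue does not specify how the time series would be laid out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical model of a per-bucket index, split into time slices
// so that no single index entry grows without bound.
public class BucketIndexSketch {
    private final Map<Map.Entry<String, String>, byte[]> blobs = new HashMap<>();
    // Assumed layout: bucket -> time slice (epoch day) -> blob ids saved in that slice.
    private final Map<String, TreeMap<Long, Set<String>>> index = new HashMap<>();
    // Number of blob ids deleteBucket had to look at.
    private int visited = 0;

    public void save(String bucket, String blobId, byte[] data, long epochDay) {
        blobs.put(Map.entry(bucket, blobId), data);
        index.computeIfAbsent(bucket, b -> new TreeMap<>())
             .computeIfAbsent(epochDay, d -> new HashSet<>())
             .add(blobId);
    }

    // Only the ids recorded for this bucket are visited; other buckets are untouched.
    public int deleteBucket(String bucket) {
        TreeMap<Long, Set<String>> slices = index.remove(bucket);
        if (slices == null) {
            return 0;
        }
        int deleted = 0;
        for (Set<String> ids : slices.values()) {
            for (String id : ids) {
                visited++;
                if (blobs.remove(Map.entry(bucket, id)) != null) {
                    deleted++;
                }
            }
        }
        return deleted;
    }

    public int visited() {
        return visited;
    }

    public int size() {
        return blobs.size();
    }
}
```

The trade-off in this sketch is an extra write per saved blob in exchange for a deletion whose cost depends only on the bucket being deleted.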


