Posted to server-dev@james.apache.org by "Tellier Benoit (JIRA)" <se...@james.apache.org> on 2018/05/06 14:16:00 UTC

[jira] [Updated] (JAMES-2390) JMAP attachment performance issues

     [ https://issues.apache.org/jira/browse/JAMES-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tellier Benoit updated JAMES-2390:
----------------------------------
    Attachment: Capture d’écran de 2018-05-06 19-35-02.png
                Capture d’écran de 2018-05-06 19-32-31.png

> JMAP attachment performance issues
> ----------------------------------
>
>                 Key: JAMES-2390
>                 URL: https://issues.apache.org/jira/browse/JAMES-2390
>             Project: James Server
>          Issue Type: New Feature
>          Components: cassandra, JMAP
>    Affects Versions: master
>            Reporter: Tellier Benoit
>            Assignee: Antoine Duprat
>            Priority: Major
>              Labels: perfomance
>         Attachments: Capture d’écran de 2018-05-06 19-32-31.png, Capture d’écran de 2018-05-06 19-35-02.png
>
>
> Most of the Cassandra failures are related to attachment downloads, and more precisely to attachment rights checking.
> Looking at the attached screenshots:
>  - A lot of warnings are generated by JMAP attachment downloads.
>  - The failure happens when reading metadata, in order to retrieve the list of referencing messages and resolve rights.
>  - Furthermore, the failure is systematic for some attachments.
> I spent a bit of time this weekend analysing these (unexpected!) performance issues. I found two straightforward performance improvements, as well as a more complex one.
>  - 1. When checking whether a set of messages is accessible, the rights of the containing mailbox were checked on a per-message basis. This is sub-optimal, as several messages might be in the same mailbox, whose rights are then needlessly checked several times.
> This change fits smoothly into the codebase: the tooling for checking rights once per mailbox is already implemented, it is just not used in that case.
>  - 2. Paging and asynchronous code don't combine well, as previous code has already shown. The mantra is *join then collect*. If the operations are done in the reverse order and the number of entries exceeds the paging size (~5000), the Cassandra driver throws an exception.
> This explains the systematic failures for some specific attachments... The fix is trivial, and I added a test demonstrating it.
>  - 3. The given logs suggest that we have high cardinality rows in our database (i.e. an attachment referenced by many messages), as the number of referencing messages exceeds 5000 (enough to trigger the paging issue).
> Such a high cardinality has a massive read cost:
>  - Reading such a row is a complex operation
>  - Caching cannot help, as the cache size per primary key is exceeded
>  - Rights would be resolved for each referencing message, generating an expensive read cascade.
> Note that deduplication is done at the Attachment level. Looking at the attachment names (cf. screenshots), these "high cardinality" attachments look like inlined images in signatures...
> The stance here is that deduplication is not a concern for attachments, but for blobs. We should push this constraint further down the stack. That way, each blob would be deduplicated (storage cost reduction, higher FS cache efficiency, etc.) while avoiding *wide rows*.
> We should ensure each newly generated AttachmentId is unique, then generate the BlobId from the blob's content, to avoid wide rows while keeping deduplication in place.
> Note that since this applies only to newly received messages, it can be done transparently, without the need for a migration.
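
Improvement 1 above (checking rights once per mailbox instead of once per message) could look roughly like the following. This is only a sketch: `MailboxRightsSketch`, `Message` and the `canReadMailbox` predicate are hypothetical stand-ins, not names from the James codebase.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical sketch: none of these types exist in James under these names.
class MailboxRightsSketch {

    // Minimal stand-in for a message that knows its containing mailbox.
    static final class Message {
        final String messageId;
        final String mailboxId;

        Message(String messageId, String mailboxId) {
            this.messageId = messageId;
            this.mailboxId = mailboxId;
        }
    }

    // Returns the ids of accessible messages. The (assumed expensive) rights
    // check runs once per distinct mailbox, and its result is reused for
    // every other message stored in that mailbox.
    static Set<String> accessibleMessageIds(List<Message> messages, Predicate<String> canReadMailbox) {
        Map<String, Boolean> rightsByMailbox = new HashMap<>();
        Set<String> accessible = new LinkedHashSet<>();
        for (Message message : messages) {
            boolean allowed = rightsByMailbox.computeIfAbsent(message.mailboxId, canReadMailbox::test);
            if (allowed) {
                accessible.add(message.messageId);
            }
        }
        return accessible;
    }

    public static void main(String[] args) {
        AtomicInteger checks = new AtomicInteger();
        List<Message> messages = Arrays.asList(
            new Message("m1", "inbox"),
            new Message("m2", "inbox"),
            new Message("m3", "shared"));

        Set<String> accessible = accessibleMessageIds(messages, mailboxId -> {
            checks.incrementAndGet(); // count how often rights are resolved
            return mailboxId.equals("inbox");
        });

        System.out.println("accessible messages: " + accessible.size());
        System.out.println("rights checks performed: " + checks.get());
    }
}
```

With three messages spread over two mailboxes, the predicate is evaluated twice rather than three times; the saving grows with the number of messages sharing a mailbox.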

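
The *join then collect* rule from improvement 2 can be illustrated with a simulation. The issue does not spell out the mechanism, so the following rests on an assumption: that a paged result refuses to fetch its next page when it is consumed from the driver's I/O thread, as happens when rows are collected inside an asynchronous callback. No Cassandra driver is involved here; `PagedRows`, `PAGE_SIZE` and the thread name are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Simulation only: every name here is hypothetical, nothing comes from a real driver.
class JoinThenCollectSketch {
    static final int PAGE_SIZE = 5000;
    static final String IO_THREAD = "simulated-io-thread";

    // Mimics a paged result: the first page is available, later pages need a
    // blocking fetch, which is refused (by assumption) on the I/O thread.
    static final class PagedRows implements Iterator<Integer> {
        private final int total;
        private int next = 0;

        PagedRows(int total) {
            this.total = total;
        }

        @Override
        public boolean hasNext() {
            return next < total;
        }

        @Override
        public Integer next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            boolean crossesPage = next > 0 && next % PAGE_SIZE == 0;
            if (crossesPage && Thread.currentThread().getName().equals(IO_THREAD)) {
                throw new IllegalStateException("blocking page fetch attempted on the I/O thread");
            }
            return next++;
        }
    }

    static ExecutorService newIoExecutor() {
        return Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, IO_THREAD);
            thread.setDaemon(true);
            return thread;
        });
    }

    static List<Integer> collect(Iterator<Integer> rows) {
        List<Integer> collected = new ArrayList<>();
        while (rows.hasNext()) {
            collected.add(rows.next());
        }
        return collected;
    }

    static CompletableFuture<PagedRows> queryAsync(ExecutorService io, int total) {
        return CompletableFuture.supplyAsync(() -> new PagedRows(total), io);
    }

    // Join first, then collect on the calling thread: paging is allowed to block.
    static List<Integer> joinThenCollect(ExecutorService io, int total) {
        return collect(queryAsync(io, total).join());
    }

    // Reverse order: collect inside a callback on the I/O thread, then join.
    static List<Integer> collectThenJoin(ExecutorService io, int total) {
        return queryAsync(io, total).thenApplyAsync(JoinThenCollectSketch::collect, io).join();
    }

    public static void main(String[] args) {
        ExecutorService io = newIoExecutor();
        try {
            System.out.println("join then collect: " + joinThenCollect(io, 12000).size() + " rows");
            try {
                collectThenJoin(io, 12000);
                System.out.println("collect then join: no failure");
            } catch (CompletionException e) {
                System.out.println("collect then join failed: " + e.getCause().getMessage());
            }
        } finally {
            io.shutdown();
        }
    }
}
```

In this simulation the reverse order only fails once the row count exceeds one page, which matches the observation that failures are systematic for some attachments and absent for others.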

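
The proposal in improvement 3 (a unique AttachmentId per attachment, a BlobId derived from the content) can be sketched as follows. The class and method names are hypothetical, and the choice of a random UUID and of SHA-256 is an assumption made for illustration; the issue does not say which id scheme or hash is intended.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

// Hypothetical sketch: not the James AttachmentId / BlobId implementation.
class AttachmentIdSketch {

    // Assumption: a random UUID makes every new attachment id unique, so no
    // single attachment row accumulates thousands of referencing messages.
    static String newAttachmentId() {
        return UUID.randomUUID().toString();
    }

    // Assumption: SHA-256 of the content as blob id, so identical content
    // maps to the same blob and is stored only once.
    static String blobIdFor(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 support is mandated for every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] signatureImage = "same inlined signature image".getBytes(StandardCharsets.UTF_8);

        // The same content received in two different messages.
        String firstAttachment = newAttachmentId();
        String secondAttachment = newAttachmentId();
        String firstBlob = blobIdFor(signatureImage);
        String secondBlob = blobIdFor(signatureImage.clone());

        System.out.println("attachment ids differ: " + !firstAttachment.equals(secondAttachment));
        System.out.println("blob ids match: " + firstBlob.equals(secondBlob));
    }
}
```

Each message then references its own attachment entry, while the storage layer still holds a single copy of the shared bytes.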

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
For additional commands, e-mail: server-dev-help@james.apache.org