Posted to notifications@james.apache.org by GitBox <gi...@apache.org> on 2021/08/07 00:23:05 UTC

[GitHub] [james-project] chibenwa commented on pull request #570: JAMES-3623 Provide a (multi-DC friendly) Distributed POP3 Application

chibenwa commented on pull request #570:
URL: https://github.com/apache/james-project/pull/570#issuecomment-894576161


   # Why
   
   The James POP3 implementation is backed by the IMAP UID, a monotonic counter. The Cassandra implementation uses lightweight transactions (LWTs) to back a compare-and-swap.
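
   As a rough illustration of the compare-and-swap allocation described above, here is a minimal sketch. `UidAllocator` is a hypothetical name, not a James class, and an in-memory `AtomicLong` stands in for the Cassandra row; in Cassandra the `compareAndSet` call would be a conditional write, which is what makes it an LWT.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the per-mailbox UID counter (not a James class).
final class UidAllocator {
    // Assumption: the last allocated UID lives in a single cell.
    private final AtomicLong lastUid = new AtomicLong(0);

    long nextUid() {
        while (true) {
            long current = lastUid.get();      // read the current value
            long candidate = current + 1;
            // Write only if nobody changed the value since the read.
            if (lastUid.compareAndSet(current, candidate)) {
                return candidate;
            }
            // Another writer won the race: read again and retry.
        }
    }
}
```

   Every allocation pays for the conditional write, which is where the cost discussed next comes from.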
   
   This implementation has expensive runtime costs, especially in a multi-datacenter setup (LWTs require 4 round trips across replicas, even for reads).
   
   # How
   
   We should contribute to apache/james-project an alternative implementation of the POP3 server that does not
   leverage UIDs but instead uses messageIds, enabling a safe, easy-to-configure multi-datacenter POP3 setup for the Distributed Server.
   
   Use a dedicated view in order to list the messageIds within a mailbox and the size of the messages and use
   the messageIdManager to retrieve the given emails.
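
   To make the idea of the dedicated view concrete, here is a minimal sketch, assuming an in-memory list in place of the Cassandra table. `Pop3MailboxView` and its methods are hypothetical names, not the classes added by this changeset; the response shapes follow the POP3 `STAT` and `UIDL` commands.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for the dedicated view: one row per
// message, holding only what POP3 listing needs (messageId and size).
final class Pop3MailboxView {
    static final class Row {
        final String messageId;
        final long size;

        Row(String messageId, long size) {
            this.messageId = messageId;
            this.size = size;
        }
    }

    private final List<Row> rows = new ArrayList<>();

    void add(String messageId, long size) {
        rows.add(new Row(messageId, size));
    }

    // STAT: message count and total size in octets.
    String stat() {
        long total = 0;
        for (Row row : rows) {
            total += row.size;
        }
        return "+OK " + rows.size() + " " + total;
    }

    // UIDL: the unique-id is the messageId, so no IMAP UID is involved.
    List<String> uidl() {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            lines.add((i + 1) + " " + rows.get(i).messageId);
        }
        return lines;
    }

    // RETR n: resolve the 1-based message number to a messageId; the
    // content would then be fetched through the messageIdManager.
    String messageIdFor(int messageNumber) {
        return rows.get(messageNumber - 1).messageId;
    }
}
```

   Listing never touches a counter here: message numbers are positions in the view, and the stable identifier is the messageId.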
   
   Offer a configuration option to choose between the classic 'uid based' implementation and the 'messageId based' implementation via the module-choosing mechanism.
   
   # Consequences
   
   By implementing this, we will have more options in the face of bad lightweight transaction performance:
   
    - Using the `-Dcassandra.unsafe.disable-serial-reads-linearizability=true` option with Cassandra would be acceptable for a POP3 workload, but it would lead to data loss in IMAP.
    - More aggressively, the tailor-made server could include a UID/ModSeq allocation compare-and-swap mechanism that is not backed by LWTs. Trivially exposed to data races, it
   behaves correctly in the absence of concurrency, without the costs of lightweight transactions, and could be set up quickly. It would lead to data loss in IMAP.
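
   The data race mentioned in the second option can be shown with a small sketch. `NaiveUidStore` is a hypothetical name and the store is a plain in-memory field; the point is only that a read followed by an unconditional write is correct when run sequentially and hands out a duplicate when two writers interleave.

```java
// Hypothetical counter store without any conditional write: allocation is
// a plain read followed by a plain write (not a James class).
final class NaiveUidStore {
    private long lastUid = 0;

    long read() {
        return lastUid;
    }

    void write(long value) {
        lastUid = value;
    }

    // Correct only when nobody else allocates at the same time.
    long allocate() {
        long next = read() + 1;
        write(next);
        return next;
    }

    // Replays, step by step, an interleaving that two concurrent writers
    // A and B could hit: both read before either one writes.
    static long[] interleavedAllocation(NaiveUidStore store) {
        long seenByA = store.read();
        long seenByB = store.read();
        store.write(seenByA + 1);
        store.write(seenByB + 1);
        // Both writers believe they own the same value.
        return new long[] {seenByA + 1, seenByB + 1};
    }
}
```

   Run sequentially, `allocate()` yields 1 and then 2; in the replayed interleaving both writers end up with the same UID.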
   
   Also, a POP3 workload would not need a failover script to increment UIDs upon failover. However, not running it in a timely manner would result in data loss with IMAP.
   
   Places that currently rely on the IMAP UID as an identifier can no longer be relied upon. Impact studies show that some tasks, such as mail re-indexing, will be impacted (as they are backed by the IMAP UID), which is likely non-critical for a pure POP3 usage.
   
   # Tests
   
   The following changeset has been tested with Thunderbird (for the Cassandra setup).
   
   We have contributed extensive integration tests for the POP3 servers.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


