Posted to notifications@james.apache.org by rc...@apache.org on 2020/10/12 07:40:15 UTC

[james-project] 08/12: JAMES-3406 Introduction & Cassandra consistency

This is an automated email from the ASF dual-hosted git repository.

rcordier pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/james-project.git

commit 07a3f41548f696e5ffe08661dc88792ce3404eed
Author: Benoit Tellier <bt...@linagora.com>
AuthorDate: Tue Oct 6 17:28:43 2020 +0700

    JAMES-3406 Introduction & Cassandra consistency
---
 .../architecture/consistency-model.adoc            | 44 ++++++++++++++++++++--
 1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/docs/modules/servers/pages/distributed/architecture/consistency-model.adoc b/docs/modules/servers/pages/distributed/architecture/consistency-model.adoc
index 1962376..2df93ac 100644
--- a/docs/modules/servers/pages/distributed/architecture/consistency-model.adoc
+++ b/docs/modules/servers/pages/distributed/architecture/consistency-model.adoc
@@ -1,19 +1,55 @@
 = Distributed James Server &mdash; Consistency Model
 :navtitle: Consistency Model
 
-TODO introduction
+This page presents the consistency model used by the Distributed Server and
+points to the tools built around it.
 
 == Data Replication
 
-TODO
+The Distributed Server relies on different storage technologies, all having their own
+consistency models.
+
+These data stores replicate data in order to ensure some level of availability. We call
+this process replication. By consistency, we mean the ability for all replicas to hold the
+same data. By availability, we mean the ability for a replica to answer a request.
+
+In distributed systems, link:https://en.wikipedia.org/wiki/CAP_theorem[according to the CAP theorem],
+as network partitions will necessarily occur, tradeoffs need to be made between
+consistency and availability.
+
+This section details this tradeoff for data stores used by the Distributed Server.
 
 === Cassandra consistency model
 
-TODO
+link:https://cassandra.apache.org/[Cassandra] is an
+link:https://en.wikipedia.org/wiki/Eventual_consistency[eventually consistent] data store.
+This means that replicas can hold diverging data, but are guaranteed to converge over time.
+
+Several mechanisms are built into Cassandra to enforce this convergence, and need to be
+leveraged by the *Distributed Server Administrator*, namely
+link:https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/tools/nodetool/toolsRepair.html[nodetool repair],
+link:https://cassandra.apache.org/doc/latest/operating/hints.html[Hinted hand-off] and
+link:https://cassandra.apache.org/doc/latest/operating/read_repair.html[Read repair].
+
+The Distributed Server tries to mitigate inconsistencies by relying on
+link:https://docs.datastax.com/en/archived/cassandra/3.0/cassandra/dml/dmlConfigConsistency.html[QUORUM] read and write levels.
+This means that a majority of replicas is needed for read and write operations to be performed.
+
+Critical business operations, like UID allocation, rely on the strong consistency mechanism brought by
+link:https://www.datastax.com/blog/2013/07/lightweight-transactions-cassandra-20[lightweight transactions].
 
 ==== About multi data-center setups
 
-TODO
+As strong consistency is required for some operations, and as lightweight transactions are
+slow across data centers, running James with a
+link:https://docs.datastax.com/en/ddac/doc/datastax_enterprise/production/DDACmultiDCperWorkloadType.html[multi data-center]
+Cassandra setup is discouraged.
+
+However, xref:distributed/configure/cassandra.adoc[this page] describes how to set an alternative read level,
+which could be acceptable for deployments with limited requirements.
+
+Running the Distributed Server in a multi data-center setup will likely result in either data loss
+or very slow operations.
 
 === ElasticSearch consistency model
 

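As a rough illustration of the QUORUM read and write levels mentioned in the Cassandra section of the patch above: for a replication factor RF, a quorum is floor(RF / 2) + 1 replicas, so any read quorum and any write quorum share at least one replica. The sketch below is not part of the commit, and the function names are hypothetical, chosen only for illustration.

```python
def quorum_size(replication_factor: int) -> int:
    """Number of replicas that must acknowledge a QUORUM read or write."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def quorums_overlap(replication_factor: int) -> bool:
    """True when any read quorum and any write quorum share a replica."""
    q = quorum_size(replication_factor)
    return q + q > replication_factor


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be unavailable while QUORUM operations still succeed."""
    return replication_factor - quorum_size(replication_factor)


# Print quorum size, tolerated failures and overlap for a few replication factors.
for rf in (1, 3, 5):
    print(rf, quorum_size(rf), tolerated_failures(rf), quorums_overlap(rf))
```

With a replication factor of 3, for instance, QUORUM needs 2 replicas and still succeeds with 1 replica unavailable.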
