Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2018/07/19 07:07:34 UTC

[GitHub] janl commented on a change in pull request #268: Rewrite sharding documentation

janl commented on a change in pull request #268: Rewrite sharding documentation
URL: https://github.com/apache/couchdb-documentation/pull/268#discussion_r203619401
 
 

 ##########
 File path: src/cluster/sharding.rst
 ##########
 @@ -12,290 +12,498 @@
 
 .. _cluster/sharding:
 
-========
-Sharding
-========
+================
+Shard Management
+================
 
 .. _cluster/sharding/scaling-out:
 
-Scaling out
-===========
+Introduction
+------------
 
-Normally you start small and grow over time. In the beginning you might do just
-fine with one node, but as your data and number of clients grows, you need to
-scale out.
+This document discusses how sharding works in CouchDB, along with how
+to safely add, move, and remove shards and shard replicas, and how to
+create placement rules for them.
 
-For simplicity we will start fresh and small.
+A `shard
+<https://en.wikipedia.org/wiki/Shard_(database_architecture)>`__ is a
+horizontal partition of data in a database. Partitioning data into
+shards and distributing copies of each shard (called "shard replicas" or
+just "replicas") to different nodes in a cluster gives the data greater
+durability against node loss. CouchDB clusters automatically shard
+databases and distribute the subsets of documents that compose each
+shard among nodes. Modifying cluster membership and sharding behavior
+must be done manually.
 
-Start ``node1`` and add a database to it. To keep it simple we will have 2
-shards and no replicas.
+Shards and Replicas
+~~~~~~~~~~~~~~~~~~~
 
-.. code-block:: bash
+The number of shards and replicas for each database can be set at the
+global level, or on a per-database basis. The relevant parameters are
+``q`` and ``n``.
 
-    curl -X PUT "http://xxx.xxx.xxx.xxx:5984/small?n=1&q=2" --user daboss
+*q* is the number of database shards to maintain. *n* is the number of
+copies of each document to distribute. The default value for ``n`` is ``3``,
+and for ``q`` is ``8``. With ``q=8``, the database is split into 8 shards. With
+``n=3``, the cluster distributes three replicas of each shard. Altogether,
+that's 24 shard replicas for a single database. In a default 3-node cluster,
+each node would receive 8 shards. In a 4-node cluster, each node would
+receive 6 shards. We recommend in the general case that the number of
+nodes in your cluster should be a multiple of ``n``, so that shards are
+distributed evenly.
 
-If you look in the directory ``data/shards`` you will find the 2 shards.
+CouchDB nodes have an ``etc/default.ini`` file with a section named
+``[cluster]`` which looks like this:
 
-.. code-block:: text
+::
 
-    data/
-    +-- shards/
-    |   +-- 00000000-7fffffff/
-    |   |    -- small.1425202577.couch
-    |   +-- 80000000-ffffffff/
-    |        -- small.1425202577.couch
+    [cluster]
+    q=8
+    n=3
 
-Now, check the node-local ``_dbs_`` database. Here, the metadata for each
-database is stored. As the database is called ``small``, there is a document
-called ``small`` there. Let us look in it. Yes, you can get it with curl too:
+These settings can be modified to set sharding defaults for all
+databases, or they can be set on a per-database basis by specifying the
+``q`` and ``n`` query parameters when the database is created. For
+example:
 
-.. code-block:: javascript
+.. code:: bash
 
-    curl -X GET "http://xxx.xxx.xxx.xxx:5986/_dbs/small"
+    $ curl -X PUT "$COUCH_URL/database-name?q=4&n=2"
 
+That creates a database that is split into 4 shards, with 2 replicas of
+each shard, yielding 8 shard replicas distributed throughout the
+cluster.
+
+Quorum
+~~~~~~
+
+Depending on the size of the cluster, the number of shards per database,
+and the number of shard replicas, not every node may have access to
+every shard, but every node knows where all the replicas of each shard
+can be found through CouchDB's internal shard map.
+
+Each request that comes into a CouchDB cluster is handled by a single,
+arbitrarily chosen coordinating node. This coordinating node proxies
+the request to the other nodes that have the relevant data, which may
+or may not include itself. The coordinating node sends a response to
+the client once a `quorum
+<https://en.wikipedia.org/wiki/Quorum_(distributed_computing)>`__ of
+database nodes have responded; 2, by default.
+
+The size of the required quorum can be configured at request time by
+setting the ``r`` parameter for document and view reads, and the ``w``
+parameter for document writes. For example, here is a request that
+directs the coordinating node to send a response once at least two nodes
+have responded:
+
+.. code:: bash
+
+    $ curl "$COUCH_URL:5984/<doc>?r=2"
+
+Here is a similar example for writing a document:
+
+.. code:: bash
+
+    $ curl -X PUT "$COUCH_URL:5984/<db>/<doc>?w=2" -d '{...}'
+
+Setting ``r`` or ``w`` to be equal to ``n`` (the number of replicas)
+means you will only receive a response once all nodes with relevant
+shards have responded or timed out, and as such this approach does not
+guarantee `ACIDic consistency
+<https://en.wikipedia.org/wiki/ACID#Consistency>`__. Setting ``r`` or
+``w`` to 1 means you will receive a response after only one relevant
+node has responded.
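+
+For illustration, assuming the default ``n=3``, a read with ``r=1``
+returns as soon as any single replica has answered, while ``r=3`` waits
+for all three replicas to respond or time out:
+
+.. code:: bash
+
+    # fastest response, weakest consistency guarantee
+    $ curl "$COUCH_URL:5984/<db>/<doc>?r=1"
+
+    # waits for all replicas to respond or time out
+    $ curl "$COUCH_URL:5984/<db>/<doc>?r=3"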
+
+.. _cluster/sharding/move:
+
+Moving a shard
+--------------
+
+This section describes how to manually place and replace shards. These
+activities are critical steps when you determine that your cluster is
+too big or too small and want to resize it, or when you have noticed
+from server metrics that the database/shard layout is non-optimal and
+you have "hot spots" that need resolving.
+
+Consider a three-node cluster with q=8 and n=3. Each database has 24
+shards, distributed across the three nodes. If you add a fourth node to
+the cluster, CouchDB will not redistribute existing database shards to
+it. This leads to unbalanced load, as the new node will only host shards
+for databases created after it joined the cluster. To balance the
+distribution of shards from existing databases, they must be moved
+manually.
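+
+To see how the shards of an existing database are currently laid out,
+you can inspect that database's document in the node-local ``_dbs``
+database, which holds the shard map. This example assumes the
+node-local interface is listening on the default port ``5986`` on
+``localhost``, and that the database is named ``<database>``:
+
+.. code:: bash
+
+    # the "by_node" and "by_range" fields of the response show
+    # which nodes hold which shard ranges
+    $ curl "http://localhost:5986/_dbs/<database>"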
+
+Moving shards between nodes in a cluster involves the following steps:
+
+0. Ensure the target node has joined the cluster (see the membership
+   check after this list).
+1. Copy the shard(s) and any secondary index shard(s) onto the target node.
+2. Set the target node to maintenance mode.
+3. Update cluster metadata to reflect the new target shard(s).
+4. Monitor internal replication to ensure up-to-date shard(s).
+5. Clear the target node's maintenance mode.
+6. Update cluster metadata again to remove the source shard(s).
+7. Remove the shard file(s) and secondary index file(s) from the source node.
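+
+For step 0, you can confirm that the target node is part of the cluster
+by querying the cluster membership endpoint on any node; the target
+node should be listed under ``cluster_nodes``:
+
+.. code:: bash
+
+    $ curl "$COUCH_URL:5984/_membership"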
+
+Copying shard files
+~~~~~~~~~~~~~~~~~~~
+
+.. note::
+    Technically, copying database and secondary index shards is
+    optional. If you proceed to the next step without performing this
+    data copy, CouchDB will use internal replication to populate the
+    newly added shard replicas. However, this process can be very slow,
+    especially on a busy cluster, which is why we recommend performing
+    this manual data copy first.
+
+Shard files live in the ``data/shards`` directory of your CouchDB
+install, inside subdirectories named after each shard range. For
+instance, for a ``q=8`` database called ``abc``, here are its database
+shard files:
+
+::
+
+  data/shards/00000000-1fffffff/abc.1529362187.couch
+  data/shards/20000000-3fffffff/abc.1529362187.couch
+  data/shards/40000000-5fffffff/abc.1529362187.couch
+  data/shards/60000000-7fffffff/abc.1529362187.couch
+  data/shards/80000000-9fffffff/abc.1529362187.couch
+  data/shards/a0000000-bfffffff/abc.1529362187.couch
+  data/shards/c0000000-dfffffff/abc.1529362187.couch
+  data/shards/e0000000-ffffffff/abc.1529362187.couch
+
+Since they are just files, you can use ``cp``, ``rsync``,
+``scp`` or any other command to copy them from one node to another. For
+example:
+
+.. code:: bash
+
+    # on the target node, create the directory for the shard range
+    $ mkdir -p data/shards/<range>
+    # on the source node, copy the shard file to the target
+    $ scp <couch-dir>/data/shards/<range>/<database>.<datecode>.couch <node>:<couch-dir>/data/shards/<range>/
+
+Secondary indexes (including JavaScript views, Erlang views and Mango
+indexes) are also sharded, and their shards should be moved to save the
+new node the effort of rebuilding them. View shards live in
+``data/.shards``. For example:
+
+::
+
+  data/.shards
+  data/.shards/e0000000-ffffffff/_replicator.1518451591_design
+  data/.shards/e0000000-ffffffff/_replicator.1518451591_design/mrview
+  data/.shards/e0000000-ffffffff/_replicator.1518451591_design/mrview/3e823c2a4383ac0c18d4e574135a5b08.view
+  data/.shards/c0000000-dfffffff
+  data/.shards/c0000000-dfffffff/_replicator.1518451591_design
+  data/.shards/c0000000-dfffffff/_replicator.1518451591_design/mrview
+  data/.shards/c0000000-dfffffff/_replicator.1518451591_design/mrview/3e823c2a4383ac0c18d4e574135a5b08.view
+  ...
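+
+As with the database shard files above, these are plain files and
+directories, so they can be copied with ``cp``, ``rsync``, ``scp -r``,
+or similar. A sketch using the same placeholder names as before:
+
+.. code:: bash
+
+    # on the target node, create the directory for the shard range
+    $ mkdir -p data/.shards/<range>
+    # on the source node, copy the view shard directory recursively
+    $ scp -r <couch-dir>/data/.shards/<range>/<database>.<datecode>_design <node>:<couch-dir>/data/.shards/<range>/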
+
+Set the target node to ``true`` maintenance mode
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Before telling CouchDB about the new shards on the node in question, it
 
 Review comment:
   it is ambiguous whether the `it` refers to CouchDB or the node in question. This could be read as disabling the cluster, instead of just the new node.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services