Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2018/07/18 09:48:32 UTC

[GitHub] willholley commented on a change in pull request #268: Rewrite sharding documentation

URL: https://github.com/apache/couchdb-documentation/pull/268#discussion_r203316574
 
 

 ##########
 File path: src/cluster/sharding.rst
 ##########
 @@ -12,290 +12,490 @@
 
 .. _cluster/sharding:
 
-========
-Sharding
-========
+================
+Shard Management
+================
 
 .. _cluster/sharding/scaling-out:
 
-Scaling out
-===========
+Introduction
+------------
 
-Normally you start small and grow over time. In the beginning you might do just
-fine with one node, but as your data and number of clients grows, you need to
-scale out.
+A `shard
+<https://en.wikipedia.org/wiki/Shard_(database_architecture)>`__ is a
+horizontal partition of data in a database. Partitioning data into
+shards and distributing copies of each shard (called "shard replicas" or
+just "replicas") to different nodes in a cluster gives the data greater
+durability against node loss. CouchDB clusters automatically shard
+databases and distribute the subsections of documents that compose each
+shard among nodes. Modifying cluster membership and sharding behavior
+must be done manually.
 
-For simplicity we will start fresh and small.
+Shards and Replicas
+~~~~~~~~~~~~~~~~~~~
 
-Start ``node1`` and add a database to it. To keep it simple we will have 2
-shards and no replicas.
+How many shards and replicas each database has can be set at the global
+level, or on a per-database basis. The relevant parameters are *q* and
+*n*.
 
-.. code-block:: bash
+*q* is the number of database shards to maintain. *n* is the number of
+copies of each document to distribute. The default value for n is 3,
+and for q is 8. With q=8, the database is split into 8 shards. With
+n=3, the cluster distributes three replicas of each shard. Altogether,
+that's 24 shards for a single database. In a default 3-node cluster,
+each node would receive 8 shards. In a 4-node cluster, each node would
+receive 6 shards. We recommend in the general case that the number of
+nodes in your cluster should be a multiple of n, so that shards are
+distributed evenly.
 
-    curl -X PUT "http://xxx.xxx.xxx.xxx:5984/small?n=1&q=2" --user daboss
+CouchDB nodes have a ``etc/default.ini`` file with a section named
+``[cluster]`` which looks like this:
 
-If you look in the directory ``data/shards`` you will find the 2 shards.
+::
 
-.. code-block:: text
+    [cluster]
+    q=8
+    n=3
 
-    data/
-    +-- shards/
-    |   +-- 00000000-7fffffff/
-    |   |    -- small.1425202577.couch
-    |   +-- 80000000-ffffffff/
-    |        -- small.1425202577.couch
+These settings can be modified to set sharding defaults for all
+databases, or they can be set on a per-database basis by specifying the
+``q`` and ``n`` query parameters when the database is created. For
+example:
 
-Now, check the node-local ``_dbs_`` database. Here, the metadata for each
-database is stored. As the database is called ``small``, there is a document
-called ``small`` there. Let us look in it. Yes, you can get it with curl too:
+.. code:: bash
 
-.. code-block:: javascript
+    $ curl -X PUT "$COUCH_URL/database-name?q=4&n=2"
 
-    curl -X GET "http://xxx.xxx.xxx.xxx:5986/_dbs/small"
+That creates a database that is split into 4 shards and 2 replicas,
 
 Review comment:
   n=2 is probably not an example to promote, as you'll get a quorum failure if one replica is unavailable. `n=1` is perhaps useful for users who want to maintain CouchDB 1.x consistency semantics and don't need high availability (HA).
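
The arithmetic behind this concern can be sketched as follows. This is a minimal illustration, assuming the read and write quorums default to a simple majority of `n` (an assumption that is not stated in the diff above); the helper names `majority_quorum` and `tolerated_failures` are hypothetical and not part of CouchDB.

```python
def majority_quorum(n: int) -> int:
    """Smallest replica count that forms a strict majority of n.

    Assumption: the cluster's default quorum is a simple majority.
    """
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Replicas that can be unavailable while a majority quorum is still met."""
    return n - majority_quorum(n)


# Compare the replica counts discussed in the review.
for n in (1, 2, 3):
    print(
        f"n={n}: quorum={majority_quorum(n)}, "
        f"tolerated unavailable replicas={tolerated_failures(n)}"
    )
```

Under that assumption, n=2 needs both replicas to reach quorum, so it tolerates no more unavailable replicas than n=1 does, while n=3 tolerates one.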
