Posted to commits@couchdb.apache.org by da...@apache.org on 2019/01/26 19:05:35 UTC
[couchdb-documentation] branch feature/database-partitions updated:
Apply suggestions from code review
This is an automated email from the ASF dual-hosted git repository.
davisp pushed a commit to branch feature/database-partitions
in repository https://gitbox.apache.org/repos/asf/couchdb-documentation.git
The following commit(s) were added to refs/heads/feature/database-partitions by this push:
new bbf9edd Apply suggestions from code review
bbf9edd is described below
commit bbf9edde2d038ebf352ff4a3913293c6f40ef69f
Author: Jonathan Hall <fl...@flimzy.com>
AuthorDate: Sat Jan 26 13:05:31 2019 -0600
Apply suggestions from code review
Co-Authored-By: davisp <pa...@gmail.com>
---
src/partitioned-dbs/index.rst | 62 +++++++++++++++++++++----------------------
1 file changed, 31 insertions(+), 31 deletions(-)
diff --git a/src/partitioned-dbs/index.rst b/src/partitioned-dbs/index.rst
index e1bdf54..03a2fc2 100644
--- a/src/partitioned-dbs/index.rst
+++ b/src/partitioned-dbs/index.rst
@@ -82,30 +82,30 @@ and:
}
With these two indexes defined we can easily find all requests for a given
-sensor or list all sensors in a given field.
+sensor, or list all sensors in a given field.
-Unfortunately, in CouchDB when we read from either of these indexes it
+Unfortunately, in CouchDB, when we read from either of these indexes, it
requires finding a copy of every shard and asking for any documents related
to the particular sensor or field. This means that as our database scales
-up the number of shards, the more work every index request must perform.
-Fortunately for you dear reader, partitioned databases were created to solve
+up the number of shards, every index request must perform more work.
+Fortunately for you, dear reader, partitioned databases were created to solve
this precise problem.
What is a partition?
====================
-In the previous section we introduced a hypothetical database that contains
+In the previous section, we introduced a hypothetical database that contains
sensor readings from an IoT field monitoring service. In this particular
-use case it's quite logical to group all documents by their ``sensor_id``
-field. In this case we would call the sensor_id the partition.
+use case, it's quite logical to group all documents by their ``sensor_id``
+field. In this case, we would call the ``sensor_id`` the partition.
A good partition has two basic properties. First, it should have a high
cardinality. That is, there is a large number of values for the partition.
A database that has a single partition would be an anti-pattern for this
feature. Secondly, the amount of data per partition should be "small". The
general recommendation is to limit individual partitions to less than ten
-gigabytes of data. Which for the example sensor documents equates to roughly
+gigabytes of data. Which, for the example of sensor documents, equates to roughly
60,000 years of data.
@@ -117,17 +117,17 @@ of partitioned queries. Large databases with lots of documents often
have a similar pattern where there are groups of related documents that
are queried together often.
-By using partitions we can execute queries against these individual groups
+By using partitions, we can execute queries against these individual groups
of documents more efficiently by placing the entire group within a specific
-shard on disk. Thus the view engine only has to consult one copy of the
-given shard range when executing a query instead of the normal requirement
-having to execute the query across all ``Q`` shards in the database.
+shard on disk. Thus, the view engine only has to consult one copy of the
+given shard range when executing a query instead of executing
+the query across all ``Q`` shards in the database.
Partitions By Example
=====================
-To create a partitioned database we simply need to pass a query string
+To create a partitioned database, we simply need to pass a query string
parameter.
.. code-block:: bash
@@ -135,7 +135,7 @@ parameter.
shell> curl -X PUT http://127.0.0.1:5984/my_new_db?partitioned=true
{"ok":true}
-To see that our database is partitioned we can look at the database
+To see that our database is partitioned, we can look at the database
information:
.. code-block:: bash
@@ -181,8 +181,8 @@ You'll now see that the ``"props"`` member contains ``"partitioned": true``.
except design and local documents) in a partitioned database
must follow this format.
-Now that we've created a partitioned database its time to add some documents.
-Using our earlier example we could do this as such:
+Now that we've created a partitioned database, it's time to add some documents.
+Using our earlier example, we could do this as such:
.. code-block:: bash
@@ -208,20 +208,20 @@ Using our earlier example we could do this as such:
}
The only change required to the first example document is that we are now
-including the partition name in the document id by prepending the
+including the partition name in the document id by prepending it to the
old id separated by a colon.
.. note::
- The partition name in the document id is not magical. Internally
+ The partition name in the document id is not magical. Internally,
the database is simply using only the partition for hashing
- the document to a given shard instead of the entire document id.
+ the document to a given shard, instead of the entire document id.
Working with documents in a partitioned database is no different than
-a non-partitioned database. All APIs are available and existing client
+a non-partitioned database. All APIs are available, and existing client
code will all work seamlessly.
-Now that we have created a document we can get some info about the partition
+Now that we have created a document, we can get some info about the partition
containing the document:
.. code-block:: bash
@@ -274,7 +274,7 @@ id.
}
}
-We can go ahead and upload our design document and try out a partitioned
+We can upload our design document and try out a partitioned
query:
.. code-block:: bash
@@ -296,16 +296,16 @@ query:
{"id":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf","key":["sensor-260","3"],"value":null}
]}
-Hooray! Our first partitioned query. For experienced users that may not
-be the most exciting development given that the only things that have
-changed are a slight tweak to the document id and accessing views with
-a slightly different path. However, for anyone that likes performance
-improvements its actually a big deal. By knowing that the view results
-are all located within the provided partition name our partitioned
+Hooray! Our first partitioned query. For experienced users, that may not
+be the most exciting development, given that the only things that have
+changed are a slight tweak to the document id, and accessing views with
+a slightly different path. However, for anyone who likes performance
+improvements, its actually a big deal. By knowing that the view results
+are all located within the provided partition name, our partitioned
queries now perform nearly as fast as document lookups!
The last thing we'll look at is how to query data across multiple partitions.
-For that we'll implement the example sensors by field query from our
+For that, we'll implement the example sensors by field query from our
initial example. The map function will use the same update to account
for the new document id format, but is otherwise identical to the previous
version:
@@ -366,8 +366,8 @@ request like:
]}
Notice that we're not using the ``/dbname/_partition/...`` path for global
-queries. This is because global queries by definition to not cover a single
-partition. Other than having the `"partitioned": false` parameter in the
+queries. This is because global queries, by definition, do not cover individual
+partitions. Other than having the `"partitioned": false` parameter in the
design document, global design documents and queries are identical in
behavior to design documents on non-partitioned databases.
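
The document id format and the ``/dbname/_partition/...`` path touched by this
change can be illustrated with a small shell sketch. It is not part of the
commit. It assumes the ``my_new_db`` database and the local address used in the
curl example above, reuses the sample document id from the view output, and
only builds strings; it sends no request to a server.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: derive the partition from a document id and
# build the partition-scoped URL prefix. No network calls are made.

# Assumed values: database name and host copied from the curl example.
db="my_new_db"
host="http://127.0.0.1:5984"

# Sample id taken from the partitioned view output shown in the diff.
doc_id="sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf"

# The partition is the part of the id before the first colon.
partition="${doc_id%%:*}"

# Everything after the first colon is the rest of the document id.
doc_key="${doc_id#*:}"

# Partition-scoped requests live under /dbname/_partition/<partition>.
partition_url="${host}/${db}/_partition/${partition}"

echo "partition:     ${partition}"
echo "document key:  ${doc_key}"
echo "partition URL: ${partition_url}"
```

Global queries, as the text notes, do not use this prefix and go through the
ordinary ``/dbname/...`` paths instead.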