You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@couchdb.apache.org by ja...@apache.org on 2019/07/07 11:53:06 UTC

[couchdb-documentation] branch rfc/010-per-document-access created (now bf834f4)

This is an automated email from the ASF dual-hosted git repository.

jan pushed a change to branch rfc/010-per-document-access
in repository https://gitbox.apache.org/repos/asf/couchdb-documentation.git.


      at bf834f4  rfc(per-doc-access): first draft

This branch includes the following new commits:

     new bf834f4  rfc(per-doc-access): first draft

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.



[couchdb-documentation] 01/01: rfc(per-doc-access): first draft

Posted by ja...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

jan pushed a commit to branch rfc/010-per-document-access
in repository https://gitbox.apache.org/repos/asf/couchdb-documentation.git

commit bf834f4afa6d5a069606742659aabbfcd97c9acd
Author: Jan Lehnardt <ja...@apache.org>
AuthorDate: Sun Jul 7 13:52:49 2019 +0200

    rfc(per-doc-access): first draft
---
 rfcs/010-per-document-access-control.md | 397 ++++++++++++++++++++++++++++++++
 1 file changed, 397 insertions(+)

diff --git a/rfcs/010-per-document-access-control.md b/rfcs/010-per-document-access-control.md
new file mode 100644
index 0000000..852f11b
--- /dev/null
+++ b/rfcs/010-per-document-access-control.md
@@ -0,0 +1,397 @@
+---
+name: Per-Document Access Control
+about: Make the db-per-user pattern obsolete.
+title: 'Per-Document Access Control'
+labels: rfc, discussion, access control, security
+assignees: '@janl'
+
+---
+
+# Introduction
+
+Up until now (version 2.3.1), CouchDB could not serve mutually
+untrusting users accessing the same database. If a user has access to
+one document in a database, they have access to all other documents in
+the database. Some restrictions can be added about writing documents
+(designs docs are db-admin only, validate doc update (VDU) functions
+could restrict write access based on the writing user and/or the target
+document). For the remainder of this document, “db-admin” SHALL include
+server admins as well.
+
+## Abstract
+
+This lead to CouchDB developers making use of a pattern called
+db-per-user, where all documents belonging to one user are kept in a
+separate database. This is a decent enough workaround, but has the
+following downsides:
+
+- queries across all databases are not possible. An additional
+  workaround exists where all per-user databases are replicated
+  continuously into a central, admin-only database that can be used for
+  querying the entire data set, but that adds latency and uses
+  significant CPU resources. Successful systems have been built where
+  increased latency could be traded for fewer CPU resources, but
+  overall, this is not an optimal design.
+
+- handling many small databases, say >10000 (depending on hardware) can
+  become a challenge, if most of them are active concurrently. It
+  forces dbs to be set to `q=1`, migrating off `q!=1` requires
+  downtime, 10k bidirectional replications are going to need A LOT of
+  CPU and RAM. sharing documents among two or more users requires the
+  creation of yet more databases.
+
+Per-user document access aims to solve many of the above problems.
+Predominantly, that multiple users can use a single database without
+being able to see each other’s documents. A first iteration is not
+going to solve sharing of documents across multiple users and/or groups.
+
+Goals for this iteration of this feature:
+
+* allow developers to build apps wihtout having to resort to using the
+  db-per-user pattern. Specifically PouchDB applications and CouchDB
+  setups with a central server/cluster and many independent satellite
+  installations with replication should be supported.
+
+Non-goals for now:
+
+* per-access views
+* differentiation between read and write access for documents
+* sharing infividual documents between multitple users or groups.
+
+However, the design of this iteration aims to allow turning these
+non-goals into actual goals later.
+
+## Dramatis Personae
+
+*user*: a CouchDB-user, a record defined in the _users db identified by
+a username and password, has associated roles.
+
+*developer*: creator of an application built on top of CouchDB
+
+## Requirements Language
+
+[NOTE]: # ( Do not alter the section below. Follow its instructions. )
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+"SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
+document are to be interpreted as described in
+[RFC 2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+---
+
+# Detailed Description
+
+You will be able to create databases with the “access” feature enabled
+via an option passed at database creation time. If you create a
+database without that option, it works like any database in CouchDB
+today.
+
+This is how you create an access-enabled database:
+
+```
+PUT /database?access=true
+```
+
+This option can be set only at database creation time, it can’t be
+turned off and on while the database exists.
+
+An access-enabled database behaves like this:
+
+* only admin users can read or write to the database (as per 3.x
+  defaults)
+
+* admins can grant individual users and groups access to a database
+  using the database’s `_security` object. A special new role `_users`
+  can be used to say “all users defined in the `_users` database”.
+
+* documents created without an `_access` field are accessible to
+  db-admins only
+
+   * this allows existing databases to be replicated into an
+     access-enabled database, but granting access of individual docs to
+     specific users needs to be an explicit step handled by developers.
+
+* documents created with an `_access` field are only accessible by
+  admins and the user named inside `_access`.
+
+   * `_access: ["shirley"]`
+
+      * later iterations of this could allow for `["shirley"]` being
+        shorthand for `[{"read": "shirley", "write": "shirley"}]` for
+        more fine-grained access control, but that is out of scope for
+        this RFC.
+
+   * users can only create documents with their own username in
+     `_access`.
+
+   * admins can add any users to `_access`
+
+   * documents can only be owned by one user at any one point.
+
+      * in a 2.0 > X < 4.0 cluster, two different users could create
+        the same document with a different _access definition
+        concurrently and both get successful write responses back. As
+        with _users documents in conflict, if a document has a conflict
+        with separate _access entries, it becomes admin-only by
+        default. This case needs to be handled by an applications
+        _conflict handler.
+
+* document _ids are shared across all users. So only the first user who
+  creates the doc `_id: config` gets it. Applications need to ensure to
+  work around this and potentially prefix docs with the username before
+  writing/replicating them in.
+
+* _security members are allowed to write design docs, but the have to
+  have an `_access` field and those design docs with an `_access` field
+  are ignored on the server side. Db-admin ddocs get indexes built as
+  normal.
+
+   * you can’t access their views, no view indexes are built, their
+     validate_doc_update functions do not run on db inserts.
+
+   * this allows full pouchdb / satellite db replication, but avoids
+     problems with having 10000s of VDUs or 10000s of view indexes.
+
+* users can not remove themselves from `_access`, nor can they remove
+  the `_access` property. They can only `DELETE` a doc.
+
+* If an existing doc changes the user mentioned in `_access` or an admin
+  user adds a non-admin user after updating the document a couple of
+  times, that new user will gain access to the full history of the
+  document.
+
+   * if compaction hasn’t run yet, they get access to all previous
+     revision bodies that still exist.
+
+   * all conflicted versions will also be visible to the new user
+
+   * regardless of compaction, they get access to the full list of
+     revision ids for the document. Extremely crafty people could try
+     to create a matching body for a revision they didn’t have access
+     to by trying to recreate an old hash.
+
+* accessing `_changes` gives users the subset of docs they own in last
+  updated order
+
+   * gaps in the sequence id would allow folks to deduce how many other
+     docs have been created/updated/deleted in between two of their
+     docs.
+
+      * this includes all the user’s docs PLUS all non-`_access` design
+        docs, so apps can centrally control design docs going down to
+        satellites.
+
+* accessing `_all_docs` gives users the subset of docs they own in `_id`
+  order.
+
+   * this includes all the user’s docs PLUS all non-`_access` design
+     docs, so apps can centrally control design docs going down to
+     satellites.
+
+* Replication check-points / local docs
+
+   * local docs behave exactly like regular docs in that they have to
+     include an _access property when being written by a non-admin user.
+
+      * this means that replicator implementations will have to be
+        amended to include that property in the checkpoint local docs
+        they write.
+
+      * that `_access` property then will also have to be included in
+        the replication session id calculation to make sure each user
+        gets their own replication id
+
+## Implementation Details
+
+The main addition is a new native query server called
+`couch_access_native_proc`, which implements two new indexes
+`by-access-id` and `by-access-seq` which do what you’d expect, pass in
+a userCtx and retrieve the equivalent of `_all_docs` or `_changes`, but
+only including those docs that match the username and roles in their
+`_access` property. The existing handlers for `_all_docs` and
+`_changes` have been augmented to use the new indexes instead of the
+default ones, unless the user is an admin.
+
+https://github.com/apache/couchdb/compare/access?expand=1&ws=0#diff-fbb5
+3323f07579be5e46ba63cb6701c4
+
+
+# Advantages and Disadvantages
+
+The downsides of this are the additional bookkeeping required in the
+newly created `by-access-seq` and `by-access-id` indexes. Given the
+resource requirements of the alternative db-per-user, this is a more
+than welcome trade-off.
+
+As a first iteration, this aims to tackle enough probelms to be useful
+for solving real-world problems people run into.
+
+I’m envisioning future iterations that add the following features:
+
+* per-access-seq powered views
+* differentiation between read and write access for documents
+* support for multiple users in `_access: []`
+* support for groups in `_access: []`
+
+The latter two might be better suited to be implemented on a future
+FoundationDB backend.
+
+All changes proposed here should translate seamlessly to a FoundationDB
+future.
+
+
+# Key Changes
+
+There are no default changes, but folks can op into the new behaviour.
+
+## Applications and Modules affected
+
+`couch`, `couch_mrview`, `couch_index`, `couch_replicator`, `chttpd`
+
+## HTTP API additions
+
+Note: this list is acopypasta from the 2.3.1 API documentation.
+
+`/db`
+
+* no changes
+
+`/db/_all_docs`  
+`/db/{doc}`  
+
+* admin: no changes
+* user: only the docs where `req.userCtx.name == _access: [$name]`
+
+`/db/_design_docs`
+
+* TBD: problem: maybe map admin-only ddocs as `_admin` in `_access`
+  index, and then use that for this endpoint. * that would probably
+  also help with loading ddocs for VDU evaluation
+
+`/db/_bulk_get`
+
+* admin: no changes
+* user: only the docs where `req.userCtx.name == _access: [$name]`
+* ids requested that belong to other users return an `{error: {reason:
+  unauthorized}}` row
+
+`/db/_bulk_docs`
+
+* admin: no changes
+* user: only the docs where` req.userCtx.name == _access: [$name]`
+* ids requested that belong to other users return an `{error: {reason:
+  unauthorized}}` row
+
+`/db/_find`  
+`/db/_index`  
+`/db/_explain`  
+
+* admin only
+
+`/db/_shards` TBD probably no changes
+
+`/db/_shards/doc`
+
+* admin: no changes
+* user: only the docs where `req.userCtx.name == _access: [$name]` plus
+ non-_access ddocs
+
+`/db/_sync_shards` TBD probably no changes
+
+`/db/_changes`
+
+* admin: no changes
+* user: only the docs where `req.userCtx.name == _access: [$name]` plus non-_access ddocs
+
+`/db/_compact`  
+`/db/_compact/design-doc`  
+`/db/_ensure_full_commit`  
+`/db/_view_cleanup`  
+`/db/_security`  
+`/db/_purged_infos_limit`  
+`/db/_revs_limit`  
+
+* all no changes
+
+`/db/_purge`
+
+* admin: no changes
+
+* user: only the docs where `req.userCtx.name == _access: [$name]`
+
+`/db/_missing_revs`
+
+* admin: no changes
+* user: only the docs where `req.userCtx.name == _access: [$name]`
+* users of _missing_revs (i.e. replicators) need to understand a new
+  response format which includes an {error: unauthorized} message.
+
+`/db/_revs_diff`
+
+* admin: no changes
+* user: only the docs where req.userCtx.name == _access: [$name]
+* users of _missing_revs (i.e. replicators) need to understand a new
+  response format which includes an {error: unauthorized} message.
+
+`/db/doc`
+
+* admin: no changes
+* user: only the docs where `req.userCtx.name == _access: [$name]`
+
+`/db/doc/attachment`
+
+* admin: no changes
+* user: only the docs where `req.userCtx.name == _access: [$name]`
+
+`/db/_design/design-doc`  
+`/db/_design/design-doc/attachment`  
+`/db/_design/design-doc/_info`  
+
+* admin: no changes unless doc includes _access value
+* user: no access, see above
+
+`/db/_design/design-doc/_view/view-name`
+
+* admin: no changes
+* user: no access, see above
+
+`/db/_design/design-doc/_show/show-name`  
+`/db/_design/design-doc/_show/show-name/doc-id`  
+`/db/_design/design-doc/_list/list-name/view-name`  
+`/db/_design/design-doc/_list/list-name/other-ddoc/view-name`  
+`/db/_design/design-doc/_update/update-name`  
+`/db/_design/design-doc/_update/update-name/doc-id`  
+`/db/_design/design-doc/_rewrite/path`  
+
+* these are available on non-_access ddocs only (or not supported, as
+  per other changes)
+
+`/db/_local_docs /db/_local/id`
+
+* admin: no changes
+* user: only the docs where `req.userCtx.name == _access: [$name]`
+* replication engines MUST be changed to include an _access member in
+  the replication definition that can be included in _local checkpoints
+  AND _access MUST be included in the session id calculation.
+
+## HTTP API deprecations
+
+None
+
+# Security Considerations
+
+This is a significant change to the CouchDB security model. All of the
+above are security considerations.
+
+# References
+
+https://lists.apache.org/thread.html/6aa77dd8e5974a3a540758c6902ccb509ab5a2e4802ecf4fd724a5e4@%3Cdev.couchdb.apache.org%3E
+
+https://lists.apache.org/thread.html/1aae26aa329817d8c54bab615a0df1c3a7b0fd34f17a2321ecf047f3@%3Cdev.couchdb.apache.org%3E
+
+
+# Acknowledgements
+
+Thanks to @wohali who helped me talk some of these things through and
+of course all of dev@, specifically the Boston Summit attendees for
+kickstarting this effort.