You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@couchdb.apache.org by ja...@apache.org on 2019/07/07 11:53:07 UTC
[couchdb-documentation] 01/01: rfc(per-doc-access): first draft
This is an automated email from the ASF dual-hosted git repository.
jan pushed a commit to branch rfc/010-per-document-access
in repository https://gitbox.apache.org/repos/asf/couchdb-documentation.git
commit bf834f4afa6d5a069606742659aabbfcd97c9acd
Author: Jan Lehnardt <ja...@apache.org>
AuthorDate: Sun Jul 7 13:52:49 2019 +0200
rfc(per-doc-access): first draft
---
rfcs/010-per-document-access-control.md | 397 ++++++++++++++++++++++++++++++++
1 file changed, 397 insertions(+)
diff --git a/rfcs/010-per-document-access-control.md b/rfcs/010-per-document-access-control.md
new file mode 100644
index 0000000..852f11b
--- /dev/null
+++ b/rfcs/010-per-document-access-control.md
@@ -0,0 +1,397 @@
+---
+name: Per-Document Access Control
+about: Make the db-per-user pattern obsolete.
+title: 'Per-Document Access Control'
+labels: rfc, discussion, access control, security
+assignees: '@janl'
+
+---
+
+# Introduction
+
+Up until now (version 2.3.1), CouchDB could not serve mutually
+untrusting users accessing the same database. If a user has access to
+one document in a database, they have access to all other documents in
+the database. Some restrictions can be added about writing documents
+(designs docs are db-admin only, validate doc update (VDU) functions
+could restrict write access based on the writing user and/or the target
+document). For the remainder of this document, “db-admin” SHALL include
+server admins as well.
+
+## Abstract
+
+This lead to CouchDB developers making use of a pattern called
+db-per-user, where all documents belonging to one user are kept in a
+separate database. This is a decent enough workaround, but has the
+following downsides:
+
+- queries across all databases are not possible. An additional
+ workaround exists where all per-user databases are replicated
+ continuously into a central, admin-only database that can be used for
+ querying the entire data set, but that adds latency and uses
+ significant CPU resources. Successful systems have been built where
+ increased latency could be traded for fewer CPU resources, but
+ overall, this is not an optimal design.
+
+- handling many small databases, say >10000 (depending on hardware) can
+ become a challenge, if most of them are active concurrently. It
+ forces dbs to be set to `q=1`, migrating off `q!=1` requires
+ downtime, 10k bidirectional replications are going to need A LOT of
+ CPU and RAM. sharing documents among two or more users requires the
+ creation of yet more databases.
+
+Per-user document access aims to solve many of the above problems.
+Predominantly, that multiple users can use a single database without
+being able to see each other’s documents. A first iteration is not
+going to solve sharing of documents across multiple users and/or groups.
+
+Goals for this iteration of this feature:
+
+* allow developers to build apps wihtout having to resort to using the
+ db-per-user pattern. Specifically PouchDB applications and CouchDB
+ setups with a central server/cluster and many independent satellite
+ installations with replication should be supported.
+
+Non-goals for now:
+
+* per-access views
+* differentiation between read and write access for documents
+* sharing infividual documents between multitple users or groups.
+
+However, the design of this iteration aims to allow turning these
+non-goals into actual goals later.
+
+## Dramatis Personae
+
+*user*: a CouchDB-user, a record defined in the _users db identified by
+a username and password, has associated roles.
+
+*developer*: creator of an application built on top of CouchDB
+
+## Requirements Language
+
+[NOTE]: # ( Do not alter the section below. Follow its instructions. )
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+document are to be interpreted as described in
+[RFC 2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+---
+
+# Detailed Description
+
+You will be able to create databases with the “access” feature enabled
+via an option passed at database creation time. If you create a
+database without that option, it works like any database in CouchDB
+today.
+
+This is how you create an access-enabled database:
+
+```
+PUT /database?access=true
+```
+
+This option can be set only at database creation time, it can’t be
+turned off and on while the database exists.
+
+An access-enabled database behaves like this:
+
+* only admin users can read or write to the database (as per 3.x
+ defaults)
+
+* admins can grant individual users and groups access to a database
+ using the database’s `_security` object. A special new role `_users`
+ can be used to say “all users defined in the `_users` database”.
+
+* documents created without an `_access` field are accessible to
+ db-admins only
+
+ * this allows existing databases to be replicated into an
+ access-enabled database, but granting access of individual docs to
+ specific users needs to be an explicit step handled by developers.
+
+* documents created with an `_access` field are only accessible by
+ admins and the user named inside `_access`.
+
+ * `_access: ["shirley"]`
+
+ * later iterations of this could allow for `["shirley"]` being
+ shorthand for `[{"read": "shirley", "write": "shirley"}]` for
+ more fine-grained access control, but that is out of scope for
+ this RFC.
+
+ * users can only create documents with their own username in
+ `_access`.
+
+ * admins can add any users to `_access`
+
+ * documents can only be owned by one user at any one point.
+
+ * in a 2.0 > X < 4.0 cluster, two different users could create
+ the same document with a different _access definition
+ concurrently and both get successful write responses back. As
+ with _users documents in conflict, if a document has a conflict
+ with separate _access entries, it becomes admin-only by
+ default. This case needs to be handled by an applications
+ _conflict handler.
+
+* document _ids are shared across all users. So only the first user who
+ creates the doc `_id: config` gets it. Applications need to ensure to
+ work around this and potentially prefix docs with the username before
+ writing/replicating them in.
+
+* _security members are allowed to write design docs, but the have to
+ have an `_access` field and those design docs with an `_access` field
+ are ignored on the server side. Db-admin ddocs get indexes built as
+ normal.
+
+ * you can’t access their views, no view indexes are built, their
+ validate_doc_update functions do not run on db inserts.
+
+ * this allows full pouchdb / satellite db replication, but avoids
+ problems with having 10000s of VDUs or 10000s of view indexes.
+
+* users can not remove themselves from `_access`, nor can they remove
+ the `_access` property. They can only `DELETE` a doc.
+
+* If an existing doc changes the user mentioned in `_access` or an admin
+ user adds a non-admin user after updating the document a couple of
+ times, that new user will gain access to the full history of the
+ document.
+
+ * if compaction hasn’t run yet, they get access to all previous
+ revision bodies that still exist.
+
+ * all conflicted versions will also be visible to the new user
+
+ * regardless of compaction, they get access to the full list of
+ revision ids for the document. Extremely crafty people could try
+ to create a matching body for a revision they didn’t have access
+ to by trying to recreate an old hash.
+
+* accessing `_changes` gives users the subset of docs they own in last
+ updated order
+
+ * gaps in the sequence id would allow folks to deduce how many other
+ docs have been created/updated/deleted in between two of their
+ docs.
+
+ * this includes all the user’s docs PLUS all non-`_access` design
+ docs, so apps can centrally control design docs going down to
+ satellites.
+
+* accessing `_all_docs` gives users the subset of docs they own in `_id`
+ order.
+
+ * this includes all the user’s docs PLUS all non-`_access` design
+ docs, so apps can centrally control design docs going down to
+ satellites.
+
+* Replication check-points / local docs
+
+ * local docs behave exactly like regular docs in that they have to
+ include an _access property when being written by a non-admin user.
+
+ * this means that replicator implementations will have to be
+ amended to include that property in the checkpoint local docs
+ they write.
+
+ * that `_access` property then will also have to be included in
+ the replication session id calculation to make sure each user
+ gets their own replication id
+
+## Implementation Details
+
+The main addition is a new native query server called
+`couch_access_native_proc`, which implements two new indexes
+`by-access-id` and `by-access-seq` which do what you’d expect, pass in
+a userCtx and retrieve the equivalent of `_all_docs` or `_changes`, but
+only including those docs that match the username and roles in their
+`_access` property. The existing handlers for `_all_docs` and
+`_changes` have been augmented to use the new indexes instead of the
+default ones, unless the user is an admin.
+
+https://github.com/apache/couchdb/compare/access?expand=1&ws=0#diff-fbb5
+3323f07579be5e46ba63cb6701c4
+
+
+# Advantages and Disadvantages
+
+The downsides of this are the additional bookkeeping required in the
+newly created `by-access-seq` and `by-access-id` indexes. Given the
+resource requirements of the alternative db-per-user, this is a more
+than welcome trade-off.
+
+As a first iteration, this aims to tackle enough probelms to be useful
+for solving real-world problems people run into.
+
+I’m envisioning future iterations that add the following features:
+
+* per-access-seq powered views
+* differentiation between read and write access for documents
+* support for multiple users in `_access: []`
+* support for groups in `_access: []`
+
+The latter two might be better suited to be implemented on a future
+FoundationDB backend.
+
+All changes proposed here should translate seamlessly to a FoundationDB
+future.
+
+
+# Key Changes
+
+There are no default changes, but folks can op into the new behaviour.
+
+## Applications and Modules affected
+
+`couch`, `couch_mrview`, `couch_index`, `couch_replicator`, `chttpd`
+
+## HTTP API additions
+
+Note: this list is acopypasta from the 2.3.1 API documentation.
+
+`/db`
+
+* no changes
+
+`/db/_all_docs`
+`/db/{doc}`
+
+* admin: no changes
+* user: only the docs where `req.userCtx.name == _access: [$name]`
+
+`/db/_design_docs`
+
+* TBD: problem: maybe map admin-only ddocs as `_admin` in `_access`
+ index, and then use that for this endpoint. * that would probably
+ also help with loading ddocs for VDU evaluation
+
+`/db/_bulk_get`
+
+* admin: no changes
+* user: only the docs where `req.userCtx.name == _access: [$name]`
+* ids requested that belong to other users return an `{error: {reason:
+ unauthorized}}` row
+
+`/db/_bulk_docs`
+
+* admin: no changes
+* user: only the docs where` req.userCtx.name == _access: [$name]`
+* ids requested that belong to other users return an `{error: {reason:
+ unauthorized}}` row
+
+`/db/_find`
+`/db/_index`
+`/db/_explain`
+
+* admin only
+
+`/db/_shards` TBD probably no changes
+
+`/db/_shards/doc`
+
+* admin: no changes
+* user: only the docs where `req.userCtx.name == _access: [$name]` plus
+ non-_access ddocs
+
+`/db/_sync_shards` TBD probably no changes
+
+`/db/_changes`
+
+* admin: no changes
+* user: only the docs where `req.userCtx.name == _access: [$name]` plus non-_access ddocs
+
+`/db/_compact`
+`/db/_compact/design-doc`
+`/db/_ensure_full_commit`
+`/db/_view_cleanup`
+`/db/_security`
+`/db/_purged_infos_limit`
+`/db/_revs_limit`
+
+* all no changes
+
+`/db/_purge`
+
+* admin: no changes
+
+* user: only the docs where `req.userCtx.name == _access: [$name]`
+
+`/db/_missing_revs`
+
+* admin: no changes
+* user: only the docs where `req.userCtx.name == _access: [$name]`
+* users of _missing_revs (i.e. replicators) need to understand a new
+ response format which includes an {error: unauthorized} message.
+
+`/db/_revs_diff`
+
+* admin: no changes
+* user: only the docs where req.userCtx.name == _access: [$name]
+* users of _missing_revs (i.e. replicators) need to understand a new
+ response format which includes an {error: unauthorized} message.
+
+`/db/doc`
+
+* admin: no changes
+* user: only the docs where `req.userCtx.name == _access: [$name]`
+
+`/db/doc/attachment`
+
+* admin: no changes
+* user: only the docs where `req.userCtx.name == _access: [$name]`
+
+`/db/_design/design-doc`
+`/db/_design/design-doc/attachment`
+`/db/_design/design-doc/_info`
+
+* admin: no changes unless doc includes _access value
+* user: no access, see above
+
+`/db/_design/design-doc/_view/view-name`
+
+* admin: no changes
+* user: no access, see above
+
+`/db/_design/design-doc/_show/show-name`
+`/db/_design/design-doc/_show/show-name/doc-id`
+`/db/_design/design-doc/_list/list-name/view-name`
+`/db/_design/design-doc/_list/list-name/other-ddoc/view-name`
+`/db/_design/design-doc/_update/update-name`
+`/db/_design/design-doc/_update/update-name/doc-id`
+`/db/_design/design-doc/_rewrite/path`
+
+* these are available on non-_access ddocs only (or not supported, as
+ per other changes)
+
+`/db/_local_docs /db/_local/id`
+
+* admin: no changes
+* user: only the docs where `req.userCtx.name == _access: [$name]`
+* replication engines MUST be changed to include an _access member in
+ the replication definition that can be included in _local checkpoints
+ AND _access MUST be included in the session id calculation.
+
+## HTTP API deprecations
+
+None
+
+# Security Considerations
+
+This is a significant change to the CouchDB security model. All of the
+above are security considerations.
+
+# References
+
+https://lists.apache.org/thread.html/6aa77dd8e5974a3a540758c6902ccb509ab5a2e4802ecf4fd724a5e4@%3Cdev.couchdb.apache.org%3E
+
+https://lists.apache.org/thread.html/1aae26aa329817d8c54bab615a0df1c3a7b0fd34f17a2321ecf047f3@%3Cdev.couchdb.apache.org%3E
+
+
+# Acknowledgements
+
+Thanks to @wohali who helped me talk some of these things through and
+of course all of dev@, specifically the Boston Summit attendees for
+kickstarting this effort.