Posted to user@couchdb.apache.org by Chad George <ch...@mgproducts.com> on 2010/11/01 20:33:10 UTC

Document level security

I've been watching the CouchDB project for a couple years and just recently
become very interested in using it for some projects. Over the past few
weeks I've been trying to fully grok the CouchDB way.

From what I can gather, the topic of document level security has been
raised and rejected pretty often, but I didn't see anything resembling
my idea, so I thought I'd see if there was any wisdom out there for or
against it.

First, my motivation is that I really like the idea of CouchDB + browser =
entire web stack. So I'm coming at this from that perspective (with a third
app server layer this entire suggestion is irrelevant). Second, I like the
idea of separate databases + replication filtering for establishing the
primary security barriers for a mixed public/private application. But what I
think is missing is a mechanism to sanitize documents of sensitive
information, especially documents exposed to anonymous users.

I propose adding a special field to the document API like "_render" that
contains either a javascript function or a string "path" to a function
inside of a design document on the same database.
  1. the current non-filtering behavior would be equivalent to
"function(doc, req) { return doc; }"
  2. the idea is this _render function would be called any time a document
is retrieved directly from the database: HTTP GET, view w/include_docs, etc.
  3. unlike a normal show() function, the result of the _render function
should still be a JSON document so it can be used in view results and other
multi-doc situations.
  4. Maybe the "_rev" in the returned document should be updated to prevent
inadvertently writing a "sanitized" version back to the database directly.
     - on second thought the _render function should probably do this
itself, since this might be a nice way to update the schema of a document
when it is saved back.
  5. the HTTP header information being used to control caching is definitely
an issue to consider ... not sure how this impacts the idea

I was thinking that normally documents will use { _render :
"_design/app/renders/filter_me" } to make it easy to lockdown/sanitize an
entire database or class of documents. But I don't see any reason not to
allow the  _render = "function(doc, req)"  version.
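A minimal sketch of how the two forms could be told apart and resolved. Everything here is hypothetical: resolveRender and the designDocs lookup object are stand-ins I made up, and I'm assuming the design document stores the function as source text under a "renders" field, the same way other design document functions are stored as strings.

```javascript
// Hypothetical resolver for the proposed "_render" field (not a CouchDB API).
// designDocs: plain object mapping design doc ids to design documents,
// standing in for a lookup in the same database.
function resolveRender(spec, designDocs) {
  if (typeof spec !== "string") {
    return null;
  }
  if (spec.indexOf("function") === 0) {
    // Inline form: compile the source text into a function.
    return new Function("return (" + spec + ");")();
  }
  // Path form: "_design/<name>/<field>/<function name>".
  var parts = spec.split("/");
  var node = designDocs[parts[0] + "/" + parts[1]];
  for (var i = 2; i < parts.length && node; i++) {
    node = node[parts[i]];
  }
  if (typeof node !== "string") {
    return null; // missing design doc or missing function
  }
  return new Function("return (" + node + ");")();
}
```

With this, "_design/app/renders/filter_me" resolves to whatever source is stored at renders.filter_me in _design/app, and a string starting with "function" is compiled directly.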

I can appreciate the serious concern that per-document security results in a
performance hit, and it's difficult to do anyway once info is embedded into
a view.
 1. I'm most concerned about how much of a performance hit would be incurred in
checking if a document has the "_render" field, since we definitely don't
want to slow down the default case.
 2. I'm not really concerned with data that's baked into the view results
already since temp views are admin restricted already and I'm assuming the
designer could keep unsafe data out of the view itself, so we only need to
sanitize include_docs.
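For the first concern, the check I have in mind is roughly the following. This is only a sketch of the idea, not how CouchDB is structured internally; applyRender and the caller-supplied resolve function are hypothetical, and failing closed on an unresolvable _render is my own assumption about the safest behaviour.

```javascript
// Hypothetical read-path hook; nothing here is an existing CouchDB API.
// resolve: caller-supplied function that turns a _render value into a
// function, or returns null if it cannot.
function applyRender(doc, req, resolve) {
  // Default case: one property lookup, then return the document untouched.
  if (doc._render === undefined) {
    return doc;
  }
  var fn = resolve(doc._render);
  if (typeof fn !== "function") {
    // Assumption: fail closed rather than leak the unsanitized document.
    throw new Error("unresolvable _render for " + doc._id);
  }
  return fn(doc, req);
}
```

Documents without the field pay for a single property test; only documents that opt in pay for the lookup and the function call.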

I realize this eliminates the ability to do GET-modify-PUT for a document,
but the updates feature makes this unnecessary.
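As a sketch of what I mean: an update handler takes (doc, req) and returns the document to save plus a response, so the client never needs the full stored document. The handler name setTitle and the "title" field are made up for illustration, and I'm assuming the new value arrives as a form field.

```javascript
// Update handler shape: function(doc, req) returning [docToSave, response].
// "setTitle" and the "title" field are hypothetical.
function setTitle(doc, req) {
  if (!doc) {
    // No existing document: save nothing, just report it.
    return [null, "no such document"];
  }
  // Change one field server-side; fields the client never saw
  // (because they were sanitized away) stay intact.
  doc.title = req.form.title;
  return [doc, "updated"];
}
```

The client only sends the new value, and the server applies it to the real document rather than to a sanitized copy.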

I think the trickiest part is how this relates to replication:
1. the easiest path is to bypass _render functions for replication and rely
on replication filters for all security related issues
   - might need to have a filter that is *always* called during replication
to prevent an anonymous user replicating something they shouldn't
2. changes to _render functions defined in the design document wouldn't
cause all the documents that *could* be affected to be replicated
again, but I'm not sure this matters that much.
   - it seems pretty trivial to have an external script (maybe even
couchapp) go thru and "touch" all the affected documents so they do get
re-evaluated next replication cycle.
3. one way replication from a "more" secure database to a "less" secure
database should work fine I think (especially if "_rev" fields are being
updated appropriately) ... other replication scenarios will probably need
some careful use of replication filtering to work correctly.
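For option 1, the security would live in an ordinary replication filter, something like the sketch below. The "visibility" field is a made-up convention, and letting deletions through is my own assumption so that a target database learns about removed documents.

```javascript
// Replication filter shape: function(doc, req) returning true to replicate.
// The "visibility" field is a hypothetical per-document marker.
function publicOnly(doc, req) {
  if (doc._deleted) {
    // Assumption: pass deletions so the target stays consistent.
    return true;
  }
  return doc.visibility === "public";
}
```

Anything not explicitly marked public stays behind, which is the safer default for a filter that is supposed to run on every replication.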


Finally, I'm just using "_render" for discussion; I'm open to any field name
that people think makes sense.

----

that was probably long for a first post, but hopefully it sparks a good
discussion.
- Chad