You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@couchdb.apache.org by Chris Anderson <jc...@apache.org> on 2010/02/03 18:21:10 UTC

DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

On Wed, Feb 3, 2010 at 6:23 AM, Brian Candler <B....@pobox.com> wrote:
> I see the readeracl branch was recently merged into trunk, and I've just
> been testing it again.
>
> My concern is that the design is flawed, and that if this goes into 0.11
> then we are stuck with it forever; so it's better to address these sooner
> rather than later.

Let me see if I can address some of these concerns.

>
> I do see logic in keeping the admin/reader authorizations for a database
> within the database itself. The problems are:
>
> (1) _all_dbs currently shows everything - even those databases you don't
> have access to.
>
> I believe that in its current form, _all_dbs simply won't scale to millions
> of databases on a box if you want to limit it to accessible dbs only.
>

This is an interesting one. _all_dbs won't scale indefinitely even
before this patch, because it has no built-in pagination abilities.
Enhancing this feature to look into each file and keep going to till
it finds N that can be listed isn't hard to code. It will be a little
more work to make _all_dbs respect startkey and endkey.

I'd be thrilled if someone would patch _all_dbs to act correctly.

> (2) _readers is a single monolithic object. I believe that it won't scale to
> millions of users having access to the same database.

It's not meant to support this use case. If you have millions of users
with the same access rights, give them a common role and give that
role access to the database.

>
> (3) _readers has no concurrency control. One admin making an ACL change in
> futon (say) will silently overwrite changes made around the same time by
> another admin. This will get worse the more frequently users are added and
> removed.

_readers / _admins / _security are stored as a raw object without
concurrency control, because keeping them as a document adds too much
performance overhead on each request. Concurrency control is a
tradeoff we make here.

>
> For me, those are serious problems. I sketched a design for an alternative
> approach, using the _user db to hold the authorizations in terms of
> database-specific roles.  Unfortunately I didn't have time to contribute an
> implementation of this.  If there's a chance this alternative approach would
> be used then I will try to steal the time from somewhere.  The ideas behind
> it weren't explicitly rejected, but neither were they acknowledged as a good
> approach.

The database-specfic roles and names don't belong in the users db. The
users db is for answering the question: "who is the user and what
roles do they have". The ACLs say which names and roles can read or
admin a given database.

It's a fact of life that users can rsync db-files around. If the names
/ roles are in the users db, they get wrong when databases are moved
to another host or renamed on the current host.


>
> (4) An "admin" is not a "reader", and this is clearly intentional from
> comments in the code. However, someone who has an "admin" role without
> "reader" role is unable to perform ACL changes, which for me defeats the
> whole purpose of the "admin" role.
>
>
> So I'd propose that the relaxed approach is that a database "admin" should
> inherit "reader" rights. Isn't that true for a server-level admin anyway?

I considered this, but dropped it because it seemed like it'd send the
wrong message (if you put yourself as the only db-admin, this doesn't
make the db private, to do that you need to edit the readers.)

Looking at your shell transcript makes me think it'd make life easier
if we grant db-admins the reader privilege as well (so long as it
remains clear that dbs with admins but no readers are public).

4 is fixed.

>
> (5) Non-admin readers can view the entire _readers, _admins and _security
> resources.  I think this is quite a severe privacy concern, but it is easily
> fixed.

They can also read the design document. I'm not sure why this is a
privacy concern. A user may need to contact a db admin for help with
something, it's handy to be able to get a list of them. And it only
makes sense that you should be able to see the list of users who can
also access the same db you can.

If there's consensus that this is indeed an issue, it's not a hard
thing to change in the code.

>
> (6) Databases are created world-readable by default, which means a race to
> get the _readers set before someone else starts inserting documents. I think
> a PUT /dbname option to set a non-empty readers list would be a good idea
> (and a corresponding checkbox in futon)

Yes, I'd like to add some UI to the create-database form in Futon that
immediately sets a reader == the current user. The UI would be a
checkbox that says "make database private" or something. Making this
synchronous with db-creation would be a neat trick, too.

>
> (7) Couchdb accepts nonsense _readers documents, e.g.
>
> $ curl -X PUT -d '{"names":{"foo":"bar"},"roles":456}'
> http://admin:admin@127.0.0.1:5984/briantest/_readers
> {"ok":true}
>
> The effect is to reset the _readers document to its permit-all default, thus
> opening up the database to the world.
>
> $ curl http://127.0.0.1:5984/briantest/_readers
> {"names":[],"roles":[]}

My main concern in coding this was validating that malformed data
can't be stored. I've fixed this to throw errors in the case of
malformed data (instead of ignoring it).

>
> (8) Point (7) is arguably a simple bug which can be fixed, but I'd prefer
> for couchdb to be fail-safe; that is, an empty ACL means nobody has access.

The goal with this patch was to add some easy ACL functionality
without changing the default behavior. People who are happy to use
Couch in admin-party mode shouldn't see any changes.

>
> One way to achieve this would be for two new roles, "_anon" and "_user",
> granted to all unauthenticated and authenticated users respectively. Then a
> fully public database would have roles:["_anon","_user"], and this would
> be added to a new database unless you ask otherwise (see point (6)).

This ["_anon","_user"] business seems fine to me. It's a lot more
complexity for basically the same capabilities (although I do like the
ability to lock out anon users on a per-db basis, that's pretty cool.)

If anyone writes this, I'd be happy to see it. This won't be a small
patch because it's got to introduce those roles as well as use them in
the reader checks.

>
> (9) The _users db itself is world-readable (showing not only who your users
> are, but their password hashes). Highly undesirable.

I actually consider this a feature. We'd like to get some stronger
password hashing (see the bcrypt threads) which should help with the
password parts.

>
> You can set a _readers ACL on it, but it has consequences:
> * users can't sign up for new accounts
> * users can't change their own passwords
> This forces such things through a privileged external interface. Actually
> I'm fine with that, because I want to validate signups anyway, but others
> might not be.
>
> (10) I don't think you can replicate _readers, _admins and _security, unless
> (a) you are doing rsync filesystem-level replication, or (b) you explicitly
> GET and PUT these resources from one DB to another. This is arguably a
> feature not a bug.
>
> (11) Trivial problem of the day: _security resources which are not objects
> give an erlang-flavoured error.
>
> $ curl -X PUT -d '["foo","bar"]' http://brianadmin:brianadmin@127.0.0.1:5984/briantest/_security
> {"error":"unknown_error","reason":"function_clause"}
>
> Sorry for the long post, and if I really am barking up the wrong tree here,
> please tell me so.

Thanks for pointing out a lot of small bugs. The ones I think have merit are:

1, 4, 6, 7, 8, 11

Of these I've fixed 4, 7 and 11 this morning.

If you can create Jira tickets for 1, 6, and 8 that'd be nice.

Chris

>
> Regards,
>
> Brian.
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Chris Anderson <jc...@apache.org>.
On Tue, Feb 16, 2010 at 12:08 PM, Brian Candler <B....@pobox.com> wrote:
> On Tue, Feb 16, 2010 at 09:56:57AM -0800, Chris Anderson wrote:
>> This is just an artifact of following trunk. Because _admins used to
>> store a mixed list of names and roles, users upgrading from 0.10.x
>> will lose any defined db-admins. I think this is the most secure
>> choice.
>
> Just to be clear, I didn't exactly lose the admins. The data structure on
> disk had the admins, but the proplist had an atom key 'admins' instead of a
> binary key <<"admins">>. Similarly, there were atoms for 'names' and 'roles'
> instead of binaries <<"names">> and <<"roles">>
>
> So proplist:get_value(<<"admins">>, SecProp) wasn't finding
> {admins,{...}}
>
> It just looks like the on-disk format has changed at some point.
>

Yes. For a week or so trunk was using atom keys, now it is using
binaries.

Chris

> B.
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Brian Candler <B....@pobox.com>.
On Tue, Feb 16, 2010 at 09:56:57AM -0800, Chris Anderson wrote:
> This is just an artifact of following trunk. Because _admins used to
> store a mixed list of names and roles, users upgrading from 0.10.x
> will lose any defined db-admins. I think this is the most secure
> choice.

Just to be clear, I didn't exactly lose the admins. The data structure on
disk had the admins, but the proplist had an atom key 'admins' instead of a
binary key <<"admins">>. Similarly, there were atoms for 'names' and 'roles'
instead of binaries <<"names">> and <<"roles">>

So proplist:get_value(<<"admins">>, SecProp) wasn't finding
{admins,{...}}

It just looks like the on-disk format has changed at some point.

B.

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Chris Anderson <jc...@apache.org>.
On Tue, Feb 16, 2010 at 3:12 AM, Brian Candler <B....@pobox.com> wrote:
> And a bit more debugging:
>
> diff --git a/src/couchdb/couch_db.erl b/src/couchdb/couch_db.erl
> index 6d5da15..c2826e3 100644
> --- a/src/couchdb/couch_db.erl
> +++ b/src/couchdb/couch_db.erl
> @@ -238,7 +238,9 @@ check_is_admin(#db{user_ctx=#user_ctx{name=Name,roles=Roles}}=Db) ->
>     AdminRoles -> % same list, not an admin role
>         case AdminNames -- [Name] of
>         AdminNames -> % same names, not an admin
> -            throw({unauthorized, <<"You are not a db or server admin.">>});
> +            %% throw({unauthorized, <<"You are not a db or server admin.">>});
> +            Msg = list_to_binary(io_lib:format("Bah. Admins=~p, AdminNames=~p, AdminRoles=~p, Name=~p, Roles=~p", [Admins, AdminNames, AdminRoles, Name, Roles])),
> +            throw({unauthorized, Msg});
>         _ ->
>             ok
>         end;
> @@ -273,12 +275,15 @@ check_is_reader(#db{user_ctx=#user_ctx{name=Name,roles=Roles}=UserCtx}=Db) ->
>     end.
>
>  get_admins(#db{security=SecProps}) ->
> -    proplists:get_value(<<"admins">>, SecProps, {[]}).
> +    Admins = proplists:get_value(<<"admins">>, SecProps, {[]}),
> +    ?LOG_INFO("In get_admins: SecProps=~p Admins=~p", [SecProps, Admins]),
> +    Admins.
>
>  get_readers(#db{security=SecProps}) ->
>     proplists:get_value(<<"readers">>, SecProps, {[]}).
>
>  get_security(#db{security=SecProps}) ->
> +    ?LOG_INFO("In get_security: SecProps=~p", [SecProps]),
>     {SecProps}.
>
>  set_security(#db{update_pid=Pid}=Db, {NewSecProps}) when is_list(NewSecProps) ->
>
> Results:
>
> [Tue, 16 Feb 2010 10:54:12 GMT] [info] [<0.94.0>] In get_admins: SecProps=[{admins,{[{names,[<<"brianadmin">>]},{roles,[]}]}},
>                         {readers,{[{names,[]},{roles,[]}]}},
>                         {sec_obj,{[{<<"foo">>,<<"bar">>}]}}] Admins={[]}
>
> [Tue, 16 Feb 2010 10:54:12 GMT] [info] [<0.94.0>] In get_security: SecProps=[{admins,{[{names,[<<"brianadmin">>]},{roles,[]}]}},
>                           {readers,{[{names,[]},{roles,[]}]}},
>                           {sec_obj,{[{<<"foo">>,<<"bar">>}]}}]
>
> [Tue, 16 Feb 2010 10:54:12 GMT] [info] [<0.94.0>] 127.0.0.1 - - 'GET' /briantest/_security 200
>
> [Tue, 16 Feb 2010 10:54:25 GMT] [info] [<0.106.0>] In get_admins: SecProps=[{admins,{[{names,[<<"brianadmin">>]},{roles,[]}]}},
>                         {readers,{[{names,[<<"brian">>]},{roles,[]}]}},
>                         {sec_obj,{[]}}] Admins={[]}
>
> [Tue, 16 Feb 2010 10:54:25 GMT] [info] [<0.106.0>] In get_admins: SecProps=[{admins,{[{names,[<<"brianadmin">>]},{roles,[]}]}},
>                         {readers,{[{names,[]},{roles,[]}]}},
>                         {sec_obj,{[{<<"foo">>,<<"bar">>}]}}] Admins={[]}
>
> [Tue, 16 Feb 2010 10:54:25 GMT] [info] [<0.106.0>] In get_admins: SecProps=[{admins,{[{names,[<<"brianadmin">>]},{roles,[]}]}},
>                         {readers,{[{names,[]},{roles,[]}]}},
>                         {sec_obj,{[{<<"foo">>,<<"bar">>}]}}] Admins={[]}
>
> [Tue, 16 Feb 2010 10:54:25 GMT] [info] [<0.106.0>] 127.0.0.1 - - 'PUT' /briantest/_design/foo 401
>
> The problem is that in my database, SecProps has atom admins as the proplist
> key, but get_admins is looking for binary <<"admins">> as the key.
>
> However, on a freshly-created database, binaries *are* used as keys:
>
> [Tue, 16 Feb 2010 10:59:05 GMT] [info] [<0.126.0>] In get_security: SecProps=[{<<"admins">>,
>                            {[{<<"names">>,[<<"brianadmin">>]},
>                              {<<"roles">>,[]}]}},
>                           {<<"readers">>,
>                            {[{<<"names">>,[]},{<<"roles">>,[]}]}},
>                           {<<"sec_obj">>,{[{<<"foo">>,<<"bar">>}]}}]
>
> [Tue, 16 Feb 2010 10:59:05 GMT] [info] [<0.126.0>] 127.0.0.1 - - 'GET' /testdb/_security 200
>
> and I realised I could fix the problem by a GET followed by a PUT of the
> _security document.
>
> $ curl http://127.0.0.1:5984/briantest/_security | curl -X PUT --data-binary @- http://admin:admin@127.0.0.1:5984/briantest/_security
>
>
> So I would summarise this as:
>
> (1) whatever mechanism merged my old _readers and _admins into _security got
> it wrong (but I haven't found that mechanism yet)
>
> (2) the code which retrieves _security and returns it as JSON to the browser
> is unable to distinguish atoms from binaries, and so it gives a false
> impression to the user that the security settings have been set in a
> particular way, when they have not.
>

This is just an artifact of following trunk. Because _admins used to
store a mixed list of names and roles, users upgrading from 0.10.x
will lose any defined db-admins. I think this is the most secure
choice. Also, I think very few people defined db-admins prior to 0.11.

> I don't know if this is worth opening a JIRA ticket for, because maybe it
> only affects databases which have been following trunk.  But anything which
> means that security behaves differently to how it appears to have been
> configured makes me nervous.
>
> Regards,
>
> Brian.
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Brian Candler <B....@pobox.com>.
And a bit more debugging:

diff --git a/src/couchdb/couch_db.erl b/src/couchdb/couch_db.erl
index 6d5da15..c2826e3 100644
--- a/src/couchdb/couch_db.erl
+++ b/src/couchdb/couch_db.erl
@@ -238,7 +238,9 @@ check_is_admin(#db{user_ctx=#user_ctx{name=Name,roles=Roles}}=Db) ->
     AdminRoles -> % same list, not an admin role
         case AdminNames -- [Name] of
         AdminNames -> % same names, not an admin
-            throw({unauthorized, <<"You are not a db or server admin.">>});
+            %% throw({unauthorized, <<"You are not a db or server admin.">>});
+            Msg = list_to_binary(io_lib:format("Bah. Admins=~p, AdminNames=~p, AdminRoles=~p, Name=~p, Roles=~p", [Admins, AdminNames, AdminRoles, Name, Roles])),
+            throw({unauthorized, Msg});
         _ ->
             ok
         end;
@@ -273,12 +275,15 @@ check_is_reader(#db{user_ctx=#user_ctx{name=Name,roles=Roles}=UserCtx}=Db) ->
     end.
 
 get_admins(#db{security=SecProps}) ->
-    proplists:get_value(<<"admins">>, SecProps, {[]}).
+    Admins = proplists:get_value(<<"admins">>, SecProps, {[]}),
+    ?LOG_INFO("In get_admins: SecProps=~p Admins=~p", [SecProps, Admins]),
+    Admins.
 
 get_readers(#db{security=SecProps}) ->
     proplists:get_value(<<"readers">>, SecProps, {[]}).
 
 get_security(#db{security=SecProps}) ->
+    ?LOG_INFO("In get_security: SecProps=~p", [SecProps]),
     {SecProps}.
 
 set_security(#db{update_pid=Pid}=Db, {NewSecProps}) when is_list(NewSecProps) ->

Results:

[Tue, 16 Feb 2010 10:54:12 GMT] [info] [<0.94.0>] In get_admins: SecProps=[{admins,{[{names,[<<"brianadmin">>]},{roles,[]}]}},
                         {readers,{[{names,[]},{roles,[]}]}},
                         {sec_obj,{[{<<"foo">>,<<"bar">>}]}}] Admins={[]}

[Tue, 16 Feb 2010 10:54:12 GMT] [info] [<0.94.0>] In get_security: SecProps=[{admins,{[{names,[<<"brianadmin">>]},{roles,[]}]}},
                           {readers,{[{names,[]},{roles,[]}]}},
                           {sec_obj,{[{<<"foo">>,<<"bar">>}]}}]

[Tue, 16 Feb 2010 10:54:12 GMT] [info] [<0.94.0>] 127.0.0.1 - - 'GET' /briantest/_security 200

[Tue, 16 Feb 2010 10:54:25 GMT] [info] [<0.106.0>] In get_admins: SecProps=[{admins,{[{names,[<<"brianadmin">>]},{roles,[]}]}},
                         {readers,{[{names,[<<"brian">>]},{roles,[]}]}},
                         {sec_obj,{[]}}] Admins={[]}

[Tue, 16 Feb 2010 10:54:25 GMT] [info] [<0.106.0>] In get_admins: SecProps=[{admins,{[{names,[<<"brianadmin">>]},{roles,[]}]}},
                         {readers,{[{names,[]},{roles,[]}]}},
                         {sec_obj,{[{<<"foo">>,<<"bar">>}]}}] Admins={[]}

[Tue, 16 Feb 2010 10:54:25 GMT] [info] [<0.106.0>] In get_admins: SecProps=[{admins,{[{names,[<<"brianadmin">>]},{roles,[]}]}},
                         {readers,{[{names,[]},{roles,[]}]}},
                         {sec_obj,{[{<<"foo">>,<<"bar">>}]}}] Admins={[]}

[Tue, 16 Feb 2010 10:54:25 GMT] [info] [<0.106.0>] 127.0.0.1 - - 'PUT' /briantest/_design/foo 401

The problem is that in my database, SecProps has atom admins as the proplist
key, but get_admins is looking for binary <<"admins">> as the key.

However, on a freshly-created database, binaries *are* used as keys:

[Tue, 16 Feb 2010 10:59:05 GMT] [info] [<0.126.0>] In get_security: SecProps=[{<<"admins">>,
                            {[{<<"names">>,[<<"brianadmin">>]},
                              {<<"roles">>,[]}]}},
                           {<<"readers">>,
                            {[{<<"names">>,[]},{<<"roles">>,[]}]}},
                           {<<"sec_obj">>,{[{<<"foo">>,<<"bar">>}]}}]

[Tue, 16 Feb 2010 10:59:05 GMT] [info] [<0.126.0>] 127.0.0.1 - - 'GET' /testdb/_security 200

and I realised I could fix the problem by a GET followed by a PUT of the
_security document.

$ curl http://127.0.0.1:5984/briantest/_security | curl -X PUT --data-binary @- http://admin:admin@127.0.0.1:5984/briantest/_security


So I would summarise this as:

(1) whatever mechanism merged my old _readers and _admins into _security got
it wrong (but I haven't found that mechanism yet)

(2) the code which retrieves _security and returns it as JSON to the browser
is unable to distinguish atoms from binaries, and so it gives a false
impression to the user that the security settings have been set in a
particular way, when they have not.

I don't know if this is worth opening a JIRA ticket for, because maybe it
only affects databases which have been following trunk.  But anything which
means that security behaves differently to how it appears to have been
configured makes me nervous.

Regards,

Brian.

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Brian Candler <B....@pobox.com>.
On Tue, Feb 09, 2010 at 07:30:37PM +0000, Brian Candler wrote:
> > couch_db:check_is_admin() should allow access in this case.
> > 
> > If you can reliably reproduce this, I'd like to fix it.
> 
> Yes, I can reliably reproduce.

I have just upgraded to latest trunk and for some reason I can't reproduce
in a fresh db, but my existing db shows it.

I added some extra debugging in check_is_admin:

diff --git a/src/couchdb/couch_db.erl b/src/couchdb/couch_db.erl
index 6d5da15..6b033f5 100644
--- a/src/couchdb/couch_db.erl
+++ b/src/couchdb/couch_db.erl
@@ -238,7 +238,9 @@ check_is_admin(#db{user_ctx=#user_ctx{name=Name,roles=Roles}}=Db) ->
     AdminRoles -> % same list, not an admin role
         case AdminNames -- [Name] of
         AdminNames -> % same names, not an admin
-            throw({unauthorized, <<"You are not a db or server admin.">>});
+            %% throw({unauthorized, <<"You are not a db or server admin.">>});
+            Msg = list_to_binary(io_lib:format("Bah. Admins=~p, AdminNames=~p, AdminRoles=~p, Name=~p, Roles=~p", [Admins, AdminNames, AdminRoles, Name, Roles])),
+            throw({unauthorized, Msg});
         _ ->
             ok
         end;

Now this is what I see:

$ curl http://127.0.0.1:5984/briantest/_security
{"admins":{"names":["brianadmin"],"roles":[]},"readers":{"names":[],"roles":[]},"sec_obj":{"foo":"bar"}}
$ curl -X PUT -d '{}' http://brianadmin:brianadmin@127.0.0.1:5984/briantest/_design/foo
{"error":"unauthorized","reason":"Bah. Admins=[], AdminNames=[], AdminRoles=[<<\"_admin\">>], Name=<<\"brianadmin\">>, Roles=[]"}

So whilst reading the _security document via HTTP shows "brianadmin" as an
admin name, for some reason AdminNames is empty in check_is_admin.

This seems very bizarre to me: getting _security returns SecProps, but
get_admins just picks out the "admins" member of SecProps.

Regards,

Brian.

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Brian Candler <B....@pobox.com>.
On Tue, Feb 09, 2010 at 09:04:49AM -0800, Chris Anderson wrote:
> If you do a get against /_session does it show you as an admin?

Yes (see below)

> couch_db:check_is_admin() should allow access in this case.
> 
> If you can reliably reproduce this, I'd like to fix it.

Yes, I can reliably reproduce. Here is a brand new installation of trunk,
still in admin party mode:

$ curl -X PUT http://127.0.0.1:5984/zzz
{"ok":true}
$ curl -X PUT -d '{"names":["foo"]}'
http://127.0.0.1:5984/zzz/_readers
{"ok":true}
$ curl -X POST -d '{"map":"function(){}"}' http://127.0.0.1:5984/zzz/_temp_view
{"error":"unauthorized","reason":"You are not authorized to access this db."}
$ curl http://127.0.0.1:5984/
{"couchdb":"Welcome","version":"0.11.0bcc31819f-git"}
$ curl http://127.0.0.1:5984/_session
{"ok":true,"userCtx":{"name":null,"roles":["_admin"]},"info":{"authentication_db":"_users","authentication_handlers":["oauth","cookie","default"],"authenticated":"default"}}

Regards,

Brian.

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Chris Anderson <jc...@apache.org>.
On Tue, Feb 9, 2010 at 2:04 AM, Brian Candler <B....@pobox.com> wrote:
> On Tue, Feb 09, 2010 at 10:00:53AM +0000, Brian Candler wrote:
>> Even going back to Admin Party it doesn't work:
>>
>> $ curl -X POST -d '{"map":"function(doc) {}"}' http://127.0.0.1:5984/briantest/_temp_view
>> {"error":"unauthorized","reason":"You are not authorized to access this db."}
>>
>> However I'm a bit perplexed as to why view_errors.js in the test suite is
>> still passing.
>
> I can see now. I had a non-empty _readers list, and for some reason this was
> preventing even system-level-admin or admin party mode from accessing
> _temp_view. Removing this resource made it work.

If you do a get against /_session does it show you as an admin?

couch_db:check_is_admin() should allow access in this case.

If you can reliably reproduce this, I'd like to fix it.

>
> $ curl http://127.0.0.1:5984/briantest/_readers
> {"names":["brian","brianadmin"],"roles":[]}
>
> $ curl http://127.0.0.1:5984/briantest/_admins
> {"names":["brianadmin"],"roles":[]}
>
> $ curl -d '{}' -X PUT http://127.0.0.1:5984/briantest/_readers
> {"ok":true}
>
> $ curl -X POST -d '{"map":"function(doc) { emit(JSON.stringify(123,null)); }"}' http://127.0.0.1:5984/briantest/_temp_view
> {"total_rows":0,"offset":0,"rows":[]}
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Brian Candler <B....@pobox.com>.
On Tue, Feb 09, 2010 at 10:00:53AM +0000, Brian Candler wrote:
> Even going back to Admin Party it doesn't work:
> 
> $ curl -X POST -d '{"map":"function(doc) {}"}' http://127.0.0.1:5984/briantest/_temp_view
> {"error":"unauthorized","reason":"You are not authorized to access this db."}
> 
> However I'm a bit perplexed as to why view_errors.js in the test suite is
> still passing.

I can see now. I had a non-empty _readers list, and for some reason this was
preventing even system-level-admin or admin party mode from accessing
_temp_view. Removing this resource made it work.

$ curl http://127.0.0.1:5984/briantest/_readers
{"names":["brian","brianadmin"],"roles":[]}

$ curl http://127.0.0.1:5984/briantest/_admins
{"names":["brianadmin"],"roles":[]}

$ curl -d '{}' -X PUT http://127.0.0.1:5984/briantest/_readers
{"ok":true}

$ curl -X POST -d '{"map":"function(doc) { emit(JSON.stringify(123,null)); }"}' http://127.0.0.1:5984/briantest/_temp_view
{"total_rows":0,"offset":0,"rows":[]}

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Brian Candler <B....@pobox.com>.
Unless I'm doing something daft, it seems that readeracl is blocking me from
accessing _temp_view, even as a database admin or a system admin.

$ curl http://brianadmin:brianadmin@127.0.0.1:5984/briantest/_admins
{"names":["brianadmin"],"roles":[]}

$ curl http://brianadmin:brianadmin@127.0.0.1:5984/briantest/_readers
{"names":["brian","brianadmin"],"roles":[]}

$ curl http://brianadmin:brianadmin@127.0.0.1:5984/briantest/foo
{"_id":"foo","_rev":"2-7051cbe5c8faecd085a3fa619e6e6337"}

$ curl -X POST -d '{}' http://brianadmin:brianadmin@127.0.0.1:5984/briantest/_temp_view
{"error":"unauthorized","reason":"You are not authorized to access this db."}

$ curl -X POST -d '{}' http://admin:admin@127.0.0.1:5984/briantest/_temp_view
{"error":"unauthorized","reason":"You are not authorized to access this db."}

Even going back to Admin Party it doesn't work:

$ curl -X POST -d '{"map":"function(doc) {}"}' http://127.0.0.1:5984/briantest/_temp_view
{"error":"unauthorized","reason":"You are not authorized to access this db."}

However I'm a bit perplexed as to why view_errors.js in the test suite is
still passing.

Regards,

Brian.

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Chris Anderson <jc...@apache.org>.
On Mon, Feb 8, 2010 at 8:19 AM, Brian Candler <B....@pobox.com> wrote:
> Sorry to drag on about this, but I have another problem with the readeracl
> branch.
>
> As far as I can tell, it's impossible to test in validate_doc_update whether
> the user is listed in the database-level _admins resource.

This is a good example of a use case for putting all this stuff in a
single object.

Sorry I haven't had a chance to respond to your earlier message yet,
superbowl, etc. etc.

Chris

> Example:
>
> $ curl http://brianadmin:brianadmin@127.0.0.1:5984/briantest/_admins
> {"names":["brianadmin"],"roles":[]}
>
> $ curl -X PUT -d'{"validate_doc_update":"function(newDoc,oldDoc,userCtx,secObj) {if (userCtx.roles.indexOf(\"_admin\") == -1) {throw {unauthorized: \"You are not an admin\"+toJSON(userCtx)};}}"}' 'http://brianadmin:brianadmin@127.0.0.1:5984/briantest/_design/testadmin'
> {"ok":true,"id":"_design/testadmin","rev":"1-3a33960e65f156773478f357aaf67471"}
>
> $ curl -X PUT -d "{}" http://brianadmin:brianadmin@127.0.0.1:5984/briantest/foo
> {"error":"unauthorized","reason":"You are not an admin{\"db\":\"briantest\",\"name\":\"brianadmin\",\"roles\":[]}"}
>
> That is, the "_admin" role is only visible here if you are a server-wide
> admin, not if you are a database-level admin.
>
> As a result, if you want database-level admins to have any special
> privileges with regards to updating documents, you have to list them again
> in the _security document.
>
> I don't think this is clear in the security_validation.js test, because it
> uses X-Couch-Test-Auth, which gives _admin role when not logged in as any
> particular user. So
>
>      T(db.setDbProperty("_security", {admin_override : true}).ok);
>      T(db.save(doc).ok);
>
> only works because the _admin role is set here, not because the user is in
> _admins.
>
> Regards,
>
> Brian.
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Brian Candler <B....@pobox.com>.
Sorry to drag on about this, but I have another problem with the readeracl
branch.

As far as I can tell, it's impossible to test in validate_doc_update whether
the user is listed in the database-level _admins resource.  Example:

$ curl http://brianadmin:brianadmin@127.0.0.1:5984/briantest/_admins
{"names":["brianadmin"],"roles":[]}

$ curl -X PUT -d'{"validate_doc_update":"function(newDoc,oldDoc,userCtx,secObj) {if (userCtx.roles.indexOf(\"_admin\") == -1) {throw {unauthorized: \"You are not an admin\"+toJSON(userCtx)};}}"}' 'http://brianadmin:brianadmin@127.0.0.1:5984/briantest/_design/testadmin'
{"ok":true,"id":"_design/testadmin","rev":"1-3a33960e65f156773478f357aaf67471"}

$ curl -X PUT -d "{}" http://brianadmin:brianadmin@127.0.0.1:5984/briantest/foo
{"error":"unauthorized","reason":"You are not an admin{\"db\":\"briantest\",\"name\":\"brianadmin\",\"roles\":[]}"}

That is, the "_admin" role is only visible here if you are a server-wide
admin, not if you are a database-level admin.

As a result, if you want database-level admins to have any special
privileges with regards to updating documents, you have to list them again
in the _security document.

I don't think this is clear in the security_validation.js test, because it
uses X-Couch-Test-Auth, which gives _admin role when not logged in as any
particular user. So

      T(db.setDbProperty("_security", {admin_override : true}).ok);
      T(db.save(doc).ok);

only works because the _admin role is set here, not because the user is in
_admins.

Regards,

Brian.

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Brian Candler <B....@pobox.com>.
On Sat, Feb 06, 2010 at 10:52:57AM -0800, Chris Anderson wrote:
> I'd be pretty surprised if the ACLs that ship with 0.11 are
> significantly different from what I committed last week.

I thought so, which is why I haven't coded up any alternative in erlang.

> Can you explain your application needs? I'm pretty sure that the
> current ACLs can support them. But if you have a common use case falls
> out side what can be done with readers / admins / validations, maybe
> there are minimal tweaks we can do to make it easier for you.

OK. Let me start by outlining what I have now; you'll see that in many ways
this *does* align well to couchdb's new security model.

>From the user's point of view
-----------------------------

A user signs up for a database. They are its initial administrator.

They can grant access to other users. As well as read access, they can
grant:
   [X] Update
   [X] Admin
rights to those other users. "Update" allows database writes, and "Admin"
allows adding new users and changing their update/admin rights flags. I plan
to make "Update" more granular in future.

For safety, an admin cannot remove themselves, nor change their own rights
flags.  They have to ask another admin on the database to do it for them. 
That's so that databases can't end up being accidentally orphaned, without
any administrator, which would be a dead-end situation.

When you login as a user, you could have access to multiple databases.  In
that case you are given a list to choose from.

For audit purposes, all saves automatically update the "updated_by" and
"updated_at" fields (with username and timestamp respectively).  This is
enforced for "Update" users.  However an admin user is allowed to import a
CSV file containing "updated_by" and "updated_at" columns, and those
uploaded values are honoured.  That is to make it possible to accurately
restore a database from a CSV file.

So, an admin user could destroy the audit information if they wanted to, but
regular update users cannot.

The current implementation
--------------------------

Each database is a separate couchdb database, with all mediation via a Rails
app.

There is a global authorisation database with "User" and "Db" docs.

"User" contains usernames and passwords, and system-wide prefs for that user
(e.g. timezone)

There is one "Db" doc per database, and its id is the same as the database
name. It contains
  {
    "authz":{
      "user1":[],
      "user2":["admin"],
    }
  }
plus database-level preferences. By putting these docs in the global database
rather than the databases to which they refer, I can use a view to find all
databases that a user has access to.

These docs give me what _readers, _admins and _security will provide.

Issues with changing to couchdb native features
-----------------------------------------------

(1) I could implement the "update" right in couchdb using the _security
document, perhaps putting something like

  {"updaters":{
    "names":[],
    "roles":[]
  }

to be similar to _readers and _admins. This means that the authz info is
spread across three places.  In futon you either won't be able to set the
"update" right at all, or you'll have to do it in a completely different way
to readers and admins.

(2) In couchdb, an "admin" confers two distinct rights: the ability to change
rights on the database, and the ability to modify design docs.  My "admin"
right is only the first of these.

So I have to either: (a) bite the bullet and let admin users be full
database admins too, or (b) implement a more restricted "manager" right
myself.  This would require and external process sitting in front of
couchdb, which ran with admin rights, and mediated requests for adding and
removing users and rights.

This would not be necessary if the access controls were in a "real" doc,
because I could use validate_doc_update to limit who could add what access
(in the same way as validate_doc_update already works in the _users
database)

(3) There's no concurrency control on _readers, _admins or _security. It
*will* break if two people try to change them at once.

Again, the workaround is for me not to grant any real database _admins, but
have a front-end manager process which performs these changes.  The
front-end will need to contain a mutex to serialise these requests.

(4) I am currently backing up the system just by dumping _all_docs, and
there's enough info to restore the whole system from there.  It's line-based
and git-packs nicely.

However once there are hidden _reader, _admin and _security resources, those
will have to be backed up separately for each database.

(5) I won't be able to use _all_dbs to show what databases the user has
access to, either because it will show to much, or because it will be
blocked to non-server-admins.

This isn't a huge problem if I ask the user to type in the database name
they want to access the first time they use it, and then I store it in some
sort of persistent cache, so that next time they can select it from a list.

A spare field in their _users database entry would be the obvious place, but
it's not clear if this will be allowed going forward.
 
(Futon uses a persistent cookie, I believe, which wouldn't work well for me
because a user switching to a different browser or PC would lose the list of
databases they had access to. Maybe using _users for the list of recent
databases would be better for futon too).

(6) Other privacy issues as described before: I don't want user A to see
that user X exists on the system, and certainly not to see their encrypted
password.  Furthermore I don't want user A to know the names of any
databases that they don't have access to.

I can workaround the first by blocking read access to _users and
implementing privileged services in front for changing passwords and
preferences. The second requires me to put couchdb behind a HTTP reverse
proxy, unless _all_dbs becomes an admin-only resource as has been proposed.

(7) Timestamps are slightly awkward. I can update timestamps in an _update
function, but not prevent users from going direct to the database.  The best
I can think of is to add a second layer of checks in validate_doc_update:
for example, that the "updated_at" timestamp must be plus or minus two
minutes from the current time, and "updated_by" field must match the
userctx.name.

So in summary, the main points are:

* I can't trust end-users to be _admins, because there are too many ways
  they can shoot themselves in the foot, and couchdb doesn't provide
  validate_doc_update level of control. Result: I have to build external
  processes to mediate admin-type requests, and document the APIs.

* It's all just rather messy. To check if they're an _admin I need to look
  in userctx. To check if they have update rights I need to look in the
  _security resource. To enable access to the database I need to list them
  in the _readers resource.

Regards,

Brian.

RE: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by James Hayton <th...@purplebulldog.com>.
I agree with Jan.  The goal should be to achieve the best design possible at
this point.  (Not saying Brian's idea is better that what Chris has
committed because I don't have the knowledge to really know either way.) The
point is authorization is a really important "feature" and we shouldn't
really be worrying about existing client libraries prior to a 1.0 release
imho.  Getting it right is much more important.  I am starting to get
concerned about the current plans regarding authorization and releasing 1.0,
but I guess I should reply to Damien's thread about that.  

James Hayton

-----Original Message-----
From: Jan Lehnardt [mailto:jan@apache.org] 
Sent: Saturday, February 06, 2010 11:38 AM
To: dev@couchdb.apache.org
Subject: Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)


On 6 Feb 2010, at 10:52, Chris Anderson wrote:

> On Sat, Feb 6, 2010 at 1:58 AM, Brian Candler <B....@pobox.com> wrote:
>> A thought.
>> 
>> The current security model requires you spread information in three
places:
>> _readers (has access to database); _security (for controlling updates);
>> _admins (can update security and ddocs)
>> 
>> What if this was replaced by a single resource, _security, structured
like
>> this?
>> 
>>   {
>>     "names":{
>>       "foo":[],
>>       "bar":["_admin"]
>>     },
>>     "roles":{
>>       "baz":["doctor"]
>>     }
>>     "otherstuff"
>>   }
>> 
>> You get reader access if your name or role appears as key in the relevant
>> section, regardless of the value.  So [] means reader only.  You get
admin
>> access if the value contains "_admin". Other rights are available for the
>> application to use, and may affect validate_doc_update.
>> 
>> The reasons I propose this are as follows:
>> 
>> 1. it's extensible without having to change the DB structure again
>> 2. application-specific rights can be administered in the same way as
>>   couchdb system rights
>> 3. futon can display this clearly - a list of names and a list of roles,
>>   each with their rights shown next to it
>> 4. a hash key is quicker to look up for access control than an array scan
>> 
>> However it retains the current fundamental design, which is that the
rights
>> are stored in a resource within the database, which is not a document.
>> 
>> Thoughts?
>> 
>> What I've really done is to flip things around, so that
>> 
>>    "readers":["name1","name2"]
>>    "admins":["name3"]
>> 
>> becomes
>> 
>>    {"name1":["reader"], "name2":["reader"], "name3":["admin"]}
>> 
>> But doing that also makes it easier to add a future auth model where
rights
>> are stored in a more scalable way (e.g. as separate user documents or in
an
>> external LDAP database), where such information is keyed by username or
>> role.
> 
> I'm curious how this model supports the common case of a publicly
> accessible database with administrators?
> 
> As far as replacing the _admins _readers and _security database values
> with single complex value, the only reason I didn't do that is that we
> already have a bunch of client libraries that know about _admins. The
> underlying storage treats it all as one object. I don't think it
> matters either way, but I lean toward the status quo because that is
> code that has already been written.

I think this notion is dangerous. If we can opt for a better design, now
is the time to do it.

Cheers
Jan
--



> 
> I'd be pretty surprised if the ACLs that ship with 0.11 are
> significantly different from what I committed last week.
> 
> Can you explain your application needs? I'm pretty sure that the
> current ACLs can support them. But if you have a common use case falls
> out side what can be done with readers / admins / validations, maybe
> there are minimal tweaks we can do to make it easier for you.
> 
> Chris
> 
>> 
>> Regards,
>> 
>> Brian.
>> 
> 
> 
> 
> -- 
> Chris Anderson
> http://jchrisa.net
> http://couch.io



Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Jan Lehnardt <ja...@apache.org>.
On 6 Feb 2010, at 10:52, Chris Anderson wrote:

> On Sat, Feb 6, 2010 at 1:58 AM, Brian Candler <B....@pobox.com> wrote:
>> A thought.
>> 
>> The current security model requires you spread information in three places:
>> _readers (has access to database); _security (for controlling updates);
>> _admins (can update security and ddocs)
>> 
>> What if this was replaced by a single resource, _security, structured like
>> this?
>> 
>>   {
>>     "names":{
>>       "foo":[],
>>       "bar":["_admin"]
>>     },
>>     "roles":{
>>       "baz":["doctor"]
>>     }
>>     "otherstuff"
>>   }
>> 
>> You get reader access if your name or role appears as key in the relevant
>> section, regardless of the value.  So [] means reader only.  You get admin
>> access if the value contains "_admin". Other rights are available for the
>> application to use, and may affect validate_doc_update.
>> 
>> The reasons I propose this are as follows:
>> 
>> 1. it's extensible without having to change the DB structure again
>> 2. application-specific rights can be administered in the same way as
>>   couchdb system rights
>> 3. futon can display this clearly - a list of names and a list of roles,
>>   each with their rights shown next to it
>> 4. a hash key is quicker to look up for access control than an array scan
>> 
>> However it retains the current fundamental design, which is that the rights
>> are stored in a resource within the database, which is not a document.
>> 
>> Thoughts?
>> 
>> What I've really done is to flip things around, so that
>> 
>>    "readers":["name1","name2"]
>>    "admins":["name3"]
>> 
>> becomes
>> 
>>    {"name1":["reader"], "name2":["reader"], "name3":["admin"]}
>> 
>> But doing that also makes it easier to add a future auth model where rights
>> are stored in a more scalable way (e.g. as separate user documents or in an
>> external LDAP database), where such information is keyed by username or
>> role.
> 
> I'm curious how this model supports the common case of a publicly
> accessible database with administrators?
> 
> As far as replacing the _admins _readers and _security database values
> with single complex value, the only reason I didn't do that is that we
> already have a bunch of client libraries that know about _admins. The
> underlying storage treats it all as one object. I don't think it
> matters either way, but I lean toward the status quo because that is
> code that has already been written.

I think this notion is dangerous. If we can opt for a better design, now
is the time to do it.

Cheers
Jan
--



> 
> I'd be pretty surprised if the ACLs that ship with 0.11 are
> significantly different from what I committed last week.
> 
> Can you explain your application needs? I'm pretty sure that the
> current ACLs can support them. But if you have a common use case falls
> out side what can be done with readers / admins / validations, maybe
> there are minimal tweaks we can do to make it easier for you.
> 
> Chris
> 
>> 
>> Regards,
>> 
>> Brian.
>> 
> 
> 
> 
> -- 
> Chris Anderson
> http://jchrisa.net
> http://couch.io


Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Chris Anderson <jc...@apache.org>.
On Sat, Feb 6, 2010 at 1:58 AM, Brian Candler <B....@pobox.com> wrote:
> A thought.
>
> The current security model requires you spread information in three places:
> _readers (has access to database); _security (for controlling updates);
> _admins (can update security and ddocs)
>
> What if this was replaced by a single resource, _security, structured like
> this?
>
>   {
>     "names":{
>       "foo":[],
>       "bar":["_admin"]
>     },
>     "roles":{
>       "baz":["doctor"]
>     }
>     "otherstuff"
>   }
>
> You get reader access if your name or role appears as key in the relevant
> section, regardless of the value.  So [] means reader only.  You get admin
> access if the value contains "_admin". Other rights are available for the
> application to use, and may affect validate_doc_update.
>
> The reasons I propose this are as follows:
>
> 1. it's extensible without having to change the DB structure again
> 2. application-specific rights can be administered in the same way as
>   couchdb system rights
> 3. futon can display this clearly - a list of names and a list of roles,
>   each with their rights shown next to it
> 4. a hash key is quicker to look up for access control than an array scan
>
> However it retains the current fundamental design, which is that the rights
> are stored in a resource within the database, which is not a document.
>
> Thoughts?
>
> What I've really done is to flip things around, so that
>
>    "readers":["name1","name2"]
>    "admins":["name3"]
>
> becomes
>
>    {"name1":["reader"], "name2":["reader"], "name3":["admin"]}
>
> But doing that also makes it easier to add a future auth model where rights
> are stored in a more scalable way (e.g. as separate user documents or in an
> external LDAP database), where such information is keyed by username or
> role.

I'm curious how this model supports the common case of a publicly
accessible database with administrators?

As far as replacing the _admins _readers and _security database values
with single complex value, the only reason I didn't do that is that we
already have a bunch of client libraries that know about _admins. The
underlying storage treats it all as one object. I don't think it
matters either way, but I lean toward the status quo because that is
code that has already been written.

I'd be pretty surprised if the ACLs that ship with 0.11 are
significantly different from what I committed last week.

Can you explain your application needs? I'm pretty sure that the
current ACLs can support them. But if you have a common use case falls
out side what can be done with readers / admins / validations, maybe
there are minimal tweaks we can do to make it easier for you.

Chris

>
> Regards,
>
> Brian.
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Brian Candler <B....@pobox.com>.
A thought.

The current security model requires you spread information in three places:
_readers (has access to database); _security (for controlling updates);
_admins (can update security and ddocs)

What if this was replaced by a single resource, _security, structured like
this?

   {
     "names":{
       "foo":[],
       "bar":["_admin"]
     },
     "roles":{
       "baz":["doctor"]
     }
     "otherstuff"
   }

You get reader access if your name or role appears as key in the relevant
section, regardless of the value.  So [] means reader only.  You get admin
access if the value contains "_admin". Other rights are available for the
application to use, and may affect validate_doc_update.

The reasons I propose this are as follows:

1. it's extensible without having to change the DB structure again
2. application-specific rights can be administered in the same way as
   couchdb system rights
3. futon can display this clearly - a list of names and a list of roles,
   each with their rights shown next to it
4. a hash key is quicker to look up for access control than an array scan

However it retains the current fundamental design, which is that the rights
are stored in a resource within the database, which is not a document.

Thoughts?

What I've really done is to flip things around, so that

    "readers":["name1","name2"]
    "admins":["name3"]

becomes

    {"name1":["reader"], "name2":["reader"], "name3":["admin"]}

But doing that also makes it easier to add a future auth model where rights
are stored in a more scalable way (e.g. as separate user documents or in an
external LDAP database), where such information is keyed by username or
role.

Regards,

Brian.

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Brian Candler <B....@pobox.com>.
I think it's time to declare my interest.

The CouchDB-backed application I've been working on is now available here:
http://www.deploy2.net/

It's basically an IP address database / network asset inventory. CouchDB has
been a great match for this, especially some funky map-reduce logic for
indexing IPv4 and IPv6 addresses, and having separate database instances for
logical separation.

At the moment, CouchDB sits entirely behind a Rails app, which handles the
user authentication and authorization - that is, enforcing which databases
you can use, and whether you have read/update/admin rights to them.

If I can push enough of this logic down into CouchDB, then that would let me
expose CouchDB itself to the Internet.  That would give people a fast
JSON-over-HTTP API for extending the service externally.  It would also let
me move more of the UI logic up into the browser, and the option of moving
towards a CouchApp model.

So that's my particular reason for having an interest in how the ACL
features end up in 0.11.  I'd be happy to describe how my current security
model works and how it might be mapped down to CouchDB, if anyone's
interested.

(And if anyone's interested in the app itself, please feel free to try it
out and send your comments to me off-list)

Regards,

Brian.

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Jan Lehnardt <ja...@apache.org>.
Hi James,

thanks for your thoughts. I do agree with most points. But I'd like to
propose a pragmatic way out.

I think Chris' auth design is pretty solid. We have been thinking about
this space for over two years now and this is first thing that makes me
happy.

Chris' auth design is also not "complete" as in you can have a public
CouchDB running that serves all your security needs. I don't think we
should try and pretend everything is as locked down as it is needed in
every scenario, but get a limited set of scenarios right and expand from
there. I think the current way is a good way forward. We can address
more scenarios further down the line.

If that means some, even many users have to keep combining CouchDB
with a HTTP proxy as they have done in the past, I'm okay with that.

If I thought Chris' design would be fundamentally flawed, I'd have a
different position, though.

We won't ever ship 1.0 if we want CouchDB to be "perfect".

Cheers
Jan
--



On 3 Feb 2010, at 15:33, James Hayton wrote:

> Hi Everyone-
> 
> I am just an end user of couch right now, but the development of these
> security features are important to me so I thought I would share my
> thoughts.  
> 
> In general and specifically regarding points 5 and 9, I have to agree very
> passionately with Brian.  There is no way that I want my users having access
> to the email addresses, usernames, password hashes, etc... of any other
> users on my system.  That would be a very bad thing and potentially even
> derail some sort of use cases I had thought of for couch in the future.   I
> would also prefer couch to be closed by default and open as needed, but this
> doesn't matter too much as long as I can close it down.  
> 
> I also like the idea of restricting _all_dbs to return only the databases
> that the user has access to.  I would accept it as an admin only resource if
> that’s what everyone thought was best or the only option, but it seems crazy
> to allow users to see the names of databases that they don’t have access to.
> There is no point imho.  They don't have access for a reason.  
> 
> Lastly I would have to say that I would much rather have 0.11 and the
> feature freeze for 1.0 held off for weeks or even months if required to get
> this security stuff worked out properly.  It's incredibly important before
> very many real world applications with sensitive data can actually be built.
> I really believe couch is an amazing piece of software that could create so
> many new opportunities by utilizing it's unique features such as replication
> and hosted applications.  But, the true potential of couch really depends on
> rock solid authorization capabilities.  Without that, I don’t think couch is
> truly ready for a 1.0 release.  Proper authorization probably isn't an easy
> thing to solve and I understand the desire to punt on this stuff and just
> get something working, but I just don't think it’s a good long term
> strategy.  
> 
> I hope I have not come off as unappreciative or demanding.  I truly
> appreciate all the work that you guys are doing regarding security and with
> couch in general.  I also understand that my opinion probably carries very
> little relative weight here since I am not a contributor, but I would really
> plead with everyone not to rush a release or freeze the api without really
> making sure that these issues have been thought through carefully, tested,
> etc...  
> 
> Thanks, 
> 
> James Hayton
> 
> 
> 
> 
> 
> -----Original Message-----
> From: Brian Candler [mailto:B.Candler@pobox.com] 
> Sent: Wednesday, February 03, 2010 1:24 PM
> To: dev@couchdb.apache.org
> Subject: Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)
> 
> On Wed, Feb 03, 2010 at 09:21:10AM -0800, Chris Anderson wrote:
>> Let me see if I can address some of these concerns.
> 
> Thank you for taking the time to reply in detail and to implement some of
> the changes.
> 
>>> I believe that in its current form, _all_dbs simply won't scale to
> millions
>>> of databases on a box if you want to limit it to accessible dbs only.
>> 
>> This is an interesting one. _all_dbs won't scale indefinitely even
>> before this patch, because it has no built-in pagination abilities.
>> Enhancing this feature to look into each file and keep going to till
>> it finds N that can be listed isn't hard to code. It will be a little
>> more work to make _all_dbs respect startkey and endkey.
> 
> I agree that it's not hard to code. What I mean is that it won't scale if
> the server has to open and read a million files on disk to find the two that
> you have access to.
> 
> Making _all_dbs an admin-only resource as Jan proposed is brutal but
> effective in protecting the server.  (Admins probably do want _all_dbs
> paginated, but that's a separate issue).  Futon would of course have to be
> changed so that non-admin users type in the name of the database they want
> to access.
> 
> So far this hasn't answered the question: why not put the authorization in
> the _users document instead? But I think we're getting to that :-)
> 
>>> (2) _readers is a single monolithic object. I believe that it won't
> scale to
>>> millions of users having access to the same database.
>> 
>> It's not meant to support this use case. If you have millions of users
>> with the same access rights, give them a common role and give that
>> role access to the database.
> 
> That doesn't scale either, because what couchdb calls "roles" are really
> what I'd call "groups". That is, they are a system-wide collection of users.
> They are only maintainable by a system-wide administrator.
> 
> What I'm thinking of is that database1 contains application1, with a
> collection of users.  database2 contains application2, with another
> collection of users, and so on.  These databases/applications are hosted on
> the same server, but belong to third parties.
> 
> IMO it's an unrealistic expectation for the database1 owner to come along to
> the system administrator and say:
> 
> 1. I'm having problems with scaling my _readers.
> 2. Please create a new role I can use, and apply this to all my existing
>   readers.
> 3. More importantly, every time I need to add a new user to my application,
>   I will come back to you and ask you to add this role that user.
> 
> Then for the database2 owner to come along and ask for the same.
> 
> That is, because roles are *system-wide* not *database-wide* then the
> management of them doesn't scale if you want to use them for database-level
> access controls.
> 
> Given the above: as system administrator you could decide to create roles
> like "database1:_reader" to simplify administration and avoid role name
> clashes.  You could even arrange the validate_doc_update in the _users
> database so that a delegated person in database1 is able to add and remove
> database1:* roles without having to trouble the systemwide admin.
> 
> But that's exactly what my proposal was. In which case, why can we not just
> use this mechanism in the first place?
> 
>>> (3) _readers has no concurrency control. One admin making an ACL change
> in
>>> futon (say) will silently overwrite changes made around the same time by
>>> another admin. This will get worse the more frequently users are added
> and
>>> removed.
>> 
>> _readers / _admins / _security are stored as a raw object without
>> concurrency control, because keeping them as a document adds too much
>> performance overhead on each request. Concurrency control is a
>> tradeoff we make here.
> 
> Sorry to be blunt, but do you have numbers to back that up?  This smells
> very much of premature optimisation.
> 
> In any case: if db:_reader and db:_admin are just roles, you have them in
> the userctx object already.  That's clearly *more* efficient than having
> them separately in the database.
> 
> _security is an edge case. I consider it as an adjunct to the design doc. 
> You could, after all, hardcode
> 
>    var security = { .... };
> 
> in the top of your validate_doc_update; it just avoids you having to touch
> your design doc so often.  Since there's only a single _security document
> it's going to end up cached anyway.
> 
>> The database-specfic roles and names don't belong in the users db. The
>> users db is for answering the question: "who is the user and what
>> roles do they have". The ACLs say which names and roles can read or
>> admin a given database.
>> 
>> It's a fact of life that users can rsync db-files around. If the names
>> / roles are in the users db, they get wrong when databases are moved
>> to another host or renamed on the current host.
> 
> The last sentence I agree with. The same is true if you delete a DB and
> recreate one with the same name.
> 
> However, database uuids were proposed recently. If the _users doc authorized
> against uuids rather than database names, would that issue be solved?
> (The ability to have a per-user _all_dbs view would be lost, if there wasn't
> a fast way to map a uuid back to a database name, but we've already decided
> we can live without that)
> 
>> 4 is fixed.
> 
> Thanks. It didn't even add any data privacy, since an _admin could always
> add themselves as a _reader anyway.
> 
>>> (5) Non-admin readers can view the entire _readers, _admins and
> _security
>>> resources.  I think this is quite a severe privacy concern, but it is
> easily
>>> fixed.
>> 
>> They can also read the design document. I'm not sure why this is a
>> privacy concern. A user may need to contact a db admin for help with
>> something, it's handy to be able to get a list of them. And it only
>> makes sense that you should be able to see the list of users who can
>> also access the same db you can.
>> 
>> If there's consensus that this is indeed an issue, it's not a hard
>> thing to change in the code.
> 
> I await what others say. However I would certainly *not* want the internal
> E-mail addresses of my admins being available to the whole world.  And as an
> end-user of a facebook-style application, I would not want my E-mail address
> known to every other user anywhere on that database.
> 
> For comparison: if you're granted SELECT, INSERT, UPDATE and DELETE
> privileges as an Oracle user, that does not mean you get to find out the
> usernames of the admins, or indeed any other users with rights to the same
> database.
> 
> And for comparison: if someone signs up for the dev@apache.org mailing list,
> they do not get to see the E-mail addresses of all the other members of this
> list, and nor should I.
> 
> (I think the latter comparison is fair; a couchapp BBS would be a very sweet
> thing to have)
> 
>>> (9) The _users db itself is world-readable (showing not only who your
> users
>>> are, but their password hashes). Highly undesirable.
>> 
>> I actually consider this a feature. We'd like to get some stronger
>> password hashing (see the bcrypt threads) which should help with the
>> password parts.
> 
> At the end of the day, bcrypt is still a hash of a password. Any password
> hash is open to off-line brute-force attack.  You can tune the cost with
> bcrypt, but dictionary attacks are still going to succeed for 90% of users. 
> You may be running couchdb on a modest server but your attacker is thousands
> of times more powerful than you, and can spend years doing it if they want.
> 
> Put it another way: if I suggested that people should start making
> /etc/shadow world-readable, people would laugh.  If I suggested that they
> also post it on their public webserver, I would be laughed out of town.
> 
> Blocking _users is probably good enough for now. I'd be more comfortable if
> _readers didn't fail-open.  I'm also concerned that newcomers may not be
> impressed to find couchdb so "insecure" in its default state.
> 
> However, if _users could be blocked, but there were a restricted API for
> manipulating it (something like _update and _show handlers, allowing users
> only to see and change their own records), that would be much better IMO.
> 
> Regards,
> 
> Brian.
> 


RE: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by James Hayton <th...@purplebulldog.com>.
Hi Everyone-

I am just an end user of couch right now, but the development of these
security features are important to me so I thought I would share my
thoughts.  

In general and specifically regarding points 5 and 9, I have to agree very
passionately with Brian.  There is no way that I want my users having access
to the email addresses, usernames, password hashes, etc... of any other
users on my system.  That would be a very bad thing and potentially even
derail some sort of use cases I had thought of for couch in the future.   I
would also prefer couch to be closed by default and open as needed, but this
doesn't matter too much as long as I can close it down.  

I also like the idea of restricting _all_dbs to return only the databases
that the user has access to.  I would accept it as an admin only resource if
that’s what everyone thought was best or the only option, but it seems crazy
to allow users to see the names of databases that they don’t have access to.
There is no point imho.  They don't have access for a reason.  

Lastly I would have to say that I would much rather have 0.11 and the
feature freeze for 1.0 held off for weeks or even months if required to get
this security stuff worked out properly.  It's incredibly important before
very many real world applications with sensitive data can actually be built.
I really believe couch is an amazing piece of software that could create so
many new opportunities by utilizing it's unique features such as replication
and hosted applications.  But, the true potential of couch really depends on
rock solid authorization capabilities.  Without that, I don’t think couch is
truly ready for a 1.0 release.  Proper authorization probably isn't an easy
thing to solve and I understand the desire to punt on this stuff and just
get something working, but I just don't think it’s a good long term
strategy.  

I hope I have not come off as unappreciative or demanding.  I truly
appreciate all the work that you guys are doing regarding security and with
couch in general.  I also understand that my opinion probably carries very
little relative weight here since I am not a contributor, but I would really
plead with everyone not to rush a release or freeze the api without really
making sure that these issues have been thought through carefully, tested,
etc...  

Thanks, 

James Hayton





-----Original Message-----
From: Brian Candler [mailto:B.Candler@pobox.com] 
Sent: Wednesday, February 03, 2010 1:24 PM
To: dev@couchdb.apache.org
Subject: Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

On Wed, Feb 03, 2010 at 09:21:10AM -0800, Chris Anderson wrote:
> Let me see if I can address some of these concerns.

Thank you for taking the time to reply in detail and to implement some of
the changes.

> > I believe that in its current form, _all_dbs simply won't scale to
millions
> > of databases on a box if you want to limit it to accessible dbs only.
> 
> This is an interesting one. _all_dbs won't scale indefinitely even
> before this patch, because it has no built-in pagination abilities.
> Enhancing this feature to look into each file and keep going to till
> it finds N that can be listed isn't hard to code. It will be a little
> more work to make _all_dbs respect startkey and endkey.

I agree that it's not hard to code. What I mean is that it won't scale if
the server has to open and read a million files on disk to find the two that
you have access to.

Making _all_dbs an admin-only resource as Jan proposed is brutal but
effective in protecting the server.  (Admins probably do want _all_dbs
paginated, but that's a separate issue).  Futon would of course have to be
changed so that non-admin users type in the name of the database they want
to access.

So far this hasn't answered the question: why not put the authorization in
the _users document instead? But I think we're getting to that :-)

> > (2) _readers is a single monolithic object. I believe that it won't
scale to
> > millions of users having access to the same database.
> 
> It's not meant to support this use case. If you have millions of users
> with the same access rights, give them a common role and give that
> role access to the database.

That doesn't scale either, because what couchdb calls "roles" are really
what I'd call "groups". That is, they are a system-wide collection of users.
They are only maintainable by a system-wide administrator.

What I'm thinking of is that database1 contains application1, with a
collection of users.  database2 contains application2, with another
collection of users, and so on.  These databases/applications are hosted on
the same server, but belong to third parties.

IMO it's an unrealistic expectation for the database1 owner to come along to
the system administrator and say:

1. I'm having problems with scaling my _readers.
2. Please create a new role I can use, and apply this to all my existing
   readers.
3. More importantly, every time I need to add a new user to my application,
   I will come back to you and ask you to add this role that user.

Then for the database2 owner to come along and ask for the same.

That is, because roles are *system-wide* not *database-wide* then the
management of them doesn't scale if you want to use them for database-level
access controls.

Given the above: as system administrator you could decide to create roles
like "database1:_reader" to simplify administration and avoid role name
clashes.  You could even arrange the validate_doc_update in the _users
database so that a delegated person in database1 is able to add and remove
database1:* roles without having to trouble the systemwide admin.

But that's exactly what my proposal was. In which case, why can we not just
use this mechanism in the first place?

> > (3) _readers has no concurrency control. One admin making an ACL change
in
> > futon (say) will silently overwrite changes made around the same time by
> > another admin. This will get worse the more frequently users are added
and
> > removed.
> 
> _readers / _admins / _security are stored as a raw object without
> concurrency control, because keeping them as a document adds too much
> performance overhead on each request. Concurrency control is a
> tradeoff we make here.

Sorry to be blunt, but do you have numbers to back that up?  This smells
very much of premature optimisation.

In any case: if db:_reader and db:_admin are just roles, you have them in
the userctx object already.  That's clearly *more* efficient than having
them separately in the database.

_security is an edge case. I consider it as an adjunct to the design doc. 
You could, after all, hardcode

    var security = { .... };

in the top of your validate_doc_update; it just avoids you having to touch
your design doc so often.  Since there's only a single _security document
it's going to end up cached anyway.

> The database-specfic roles and names don't belong in the users db. The
> users db is for answering the question: "who is the user and what
> roles do they have". The ACLs say which names and roles can read or
> admin a given database.
> 
> It's a fact of life that users can rsync db-files around. If the names
> / roles are in the users db, they get wrong when databases are moved
> to another host or renamed on the current host.

The last sentence I agree with. The same is true if you delete a DB and
recreate one with the same name.

However, database uuids were proposed recently. If the _users doc authorized
against uuids rather than database names, would that issue be solved?
(The ability to have a per-user _all_dbs view would be lost, if there wasn't
a fast way to map a uuid back to a database name, but we've already decided
we can live without that)

> 4 is fixed.

Thanks. It didn't even add any data privacy, since an _admin could always
add themselves as a _reader anyway.

> > (5) Non-admin readers can view the entire _readers, _admins and
_security
> > resources.  I think this is quite a severe privacy concern, but it is
easily
> > fixed.
> 
> They can also read the design document. I'm not sure why this is a
> privacy concern. A user may need to contact a db admin for help with
> something, it's handy to be able to get a list of them. And it only
> makes sense that you should be able to see the list of users who can
> also access the same db you can.
> 
> If there's consensus that this is indeed an issue, it's not a hard
> thing to change in the code.

I await what others say. However I would certainly *not* want the internal
E-mail addresses of my admins being available to the whole world.  And as an
end-user of a facebook-style application, I would not want my E-mail address
known to every other user anywhere on that database.

For comparison: if you're granted SELECT, INSERT, UPDATE and DELETE
privileges as an Oracle user, that does not mean you get to find out the
usernames of the admins, or indeed any other users with rights to the same
database.

And for comparison: if someone signs up for the dev@apache.org mailing list,
they do not get to see the E-mail addresses of all the other members of this
list, and nor should I.

(I think the latter comparison is fair; a couchapp BBS would be a very sweet
thing to have)

> > (9) The _users db itself is world-readable (showing not only who your
users
> > are, but their password hashes). Highly undesirable.
> 
> I actually consider this a feature. We'd like to get some stronger
> password hashing (see the bcrypt threads) which should help with the
> password parts.

At the end of the day, bcrypt is still a hash of a password. Any password
hash is open to off-line brute-force attack.  You can tune the cost with
bcrypt, but dictionary attacks are still going to succeed for 90% of users. 
You may be running couchdb on a modest server but your attacker is thousands
of times more powerful than you, and can spend years doing it if they want.

Put it another way: if I suggested that people should start making
/etc/shadow world-readable, people would laugh.  If I suggested that they
also post it on their public webserver, I would be laughed out of town.

Blocking _users is probably good enough for now. I'd be more comfortable if
_readers didn't fail-open.  I'm also concerned that newcomers may not be
impressed to find couchdb so "insecure" in its default state.

However, if _users could be blocked, but there were a restricted API for
manipulating it (something like _update and _show handlers, allowing users
only to see and change their own records), that would be much better IMO.

Regards,

Brian.


Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Brian Candler <B....@pobox.com>.
One last thing on _reader behaviour.

If you try to access a database as a non-admin user, but don't have _reader
rights, I think you should get a 404 back which is indistinguisable from
"database does not exist".  Otherwise, you have an obvious way to probe for
database names, and if databases are named after customers, this is
information leak.

As a non-admin you never *need* to know whether it exists or not, since you
wouldn't have rights to create it anyway.

Regards,

Brian.

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Chris Anderson <jc...@apache.org>.
On Wed, Feb 3, 2010 at 1:35 PM, Brian Candler <B....@pobox.com> wrote:
> On Wed, Feb 03, 2010 at 09:24:26PM +0000, Brian Candler wrote:
>> > > (9) The _users db itself is world-readable (showing not only who your users
>> > > are, but their password hashes). Highly undesirable.
>> >
>> > I actually consider this a feature. We'd like to get some stronger
>> > password hashing (see the bcrypt threads) which should help with the
>> > password parts.
>
> Actually, passwords aren't even the issue. Just revealing the *usernames* of
> all the users on the system is the problem.
>
> For example, if I were a competitor to couch.io, I would be very happy to
> download a list of customers I should be poaching :-)

In couch.io each customer gets an entire couchdb server, so no worries
about that.

Chris



-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Brian Candler <B....@pobox.com>.
On Wed, Feb 03, 2010 at 09:24:26PM +0000, Brian Candler wrote:
> > > (9) The _users db itself is world-readable (showing not only who your users
> > > are, but their password hashes). Highly undesirable.
> > 
> > I actually consider this a feature. We'd like to get some stronger
> > password hashing (see the bcrypt threads) which should help with the
> > password parts.

Actually, passwords aren't even the issue. Just revealing the *usernames* of
all the users on the system is the problem.

For example, if I were a competitor to couch.io, I would be very happy to
download a list of customers I should be poaching :-)

Regards,

Brian.

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Brian Candler <B....@pobox.com>.
On Wed, Feb 03, 2010 at 02:27:56PM -0800, Jan Lehnardt wrote:
> > Sorry to be blunt, but do you have numbers to back that up?  This smells
> > very much of premature optimisation.
> 
> Reading a document costs: 
> 
>     1 disk seek to the end of the db file to grab the b-tree root
>  + n disk seeks to the document (n < 5 to 10 in most cases)
> 
> In a well cached database, most of the seeks are free, but the last one
> usually isn't (unless you can hold a full database in memory).

That's not true here. If _readers were a document within the database then
the *same* document would be read for every access, and therefore it would
be cached.  There's no seek.

But again you've ignored my main point: I'm proposing that the authorisation
be a role within the userctx, and the userctx is already loaded.  Free is
better than cheap.

Actually, that's not entirely true. We want to have publicly-readable
databases, so we need an "_anon" user with roles assigned to it, and the
effective roles you get would be the union of the roles from your user
record plus the roles from the _anon user.

That might mean reading the _anon record from the _users database.  That
will always be the same record, so is bound to be cached.

Regards,

Brian.

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Jan Lehnardt <ja...@apache.org>.
On 3 Feb 2010, at 13:24, Brian Candler wrote:

>> _readers / _admins / _security are stored as a raw object without
>> concurrency control, because keeping them as a document adds too much
>> performance overhead on each request. Concurrency control is a
>> tradeoff we make here.
> 
> Sorry to be blunt, but do you have numbers to back that up?  This smells
> very much of premature optimisation.

Reading a document costs: 

    1 disk seek to the end of the db file to grab the b-tree root
 + n disk seeks to the document (n < 5 to 10 in most cases)

In a well cached database, most of the seeks are free, but the last one
usually isn't (unless you can hold a full database in memory).

Doing a disk seek per request is going to kill any sort of performance
we have.

Reading from the process state is effectively free.

This is why stats don't go into a database or my log-to-db branch
is not in trunk.

Cheers
Jan
--


Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Brian Candler <B....@pobox.com>.
On Wed, Feb 03, 2010 at 09:21:10AM -0800, Chris Anderson wrote:
> Let me see if I can address some of these concerns.

Thank you for taking the time to reply in detail and to implement some of
the changes.

> > I believe that in its current form, _all_dbs simply won't scale to millions
> > of databases on a box if you want to limit it to accessible dbs only.
> 
> This is an interesting one. _all_dbs won't scale indefinitely even
> before this patch, because it has no built-in pagination abilities.
> Enhancing this feature to look into each file and keep going to till
> it finds N that can be listed isn't hard to code. It will be a little
> more work to make _all_dbs respect startkey and endkey.

I agree that it's not hard to code. What I mean is that it won't scale if
the server has to open and read a million files on disk to find the two that
you have access to.

Making _all_dbs an admin-only resource as Jan proposed is brutal but
effective in protecting the server.  (Admins probably do want _all_dbs
paginated, but that's a separate issue).  Futon would of course have to be
changed so that non-admin users type in the name of the database they want
to access.

So far this hasn't answered the question: why not put the authorization in
the _users document instead? But I think we're getting to that :-)

> > (2) _readers is a single monolithic object. I believe that it won't scale to
> > millions of users having access to the same database.
> 
> It's not meant to support this use case. If you have millions of users
> with the same access rights, give them a common role and give that
> role access to the database.

That doesn't scale either, because what couchdb calls "roles" are really
what I'd call "groups". That is, they are a system-wide collection of users.
They are only maintainable by a system-wide administrator.

What I'm thinking of is that database1 contains application1, with a
collection of users.  database2 contains application2, with another
collection of users, and so on.  These databases/applications are hosted on
the same server, but belong to third parties.

IMO it's an unrealistic expectation for the database1 owner to come along to
the system administrator and say:

1. I'm having problems with scaling my _readers.
2. Please create a new role I can use, and apply this to all my existing
   readers.
3. More importantly, every time I need to add a new user to my application,
   I will come back to you and ask you to add this role that user.

Then for the database2 owner to come along and ask for the same.

That is, because roles are *system-wide* not *database-wide* then the
management of them doesn't scale if you want to use them for database-level
access controls.

Given the above: as system administrator you could decide to create roles
like "database1:_reader" to simplify administration and avoid role name
clashes.  You could even arrange the validate_doc_update in the _users
database so that a delegated person in database1 is able to add and remove
database1:* roles without having to trouble the systemwide admin.

But that's exactly what my proposal was. In which case, why can we not just
use this mechanism in the first place?

> > (3) _readers has no concurrency control. One admin making an ACL change in
> > futon (say) will silently overwrite changes made around the same time by
> > another admin. This will get worse the more frequently users are added and
> > removed.
> 
> _readers / _admins / _security are stored as a raw object without
> concurrency control, because keeping them as a document adds too much
> performance overhead on each request. Concurrency control is a
> tradeoff we make here.

Sorry to be blunt, but do you have numbers to back that up?  This smells
very much of premature optimisation.

In any case: if db:_reader and db:_admin are just roles, you have them in
the userctx object already.  That's clearly *more* efficient than having
them separately in the database.

_security is an edge case. I consider it as an adjunct to the design doc. 
You could, after all, hardcode

    var security = { .... };

in the top of your validate_doc_update; it just avoids you having to touch
your design doc so often.  Since there's only a single _security document
it's going to end up cached anyway.

> The database-specfic roles and names don't belong in the users db. The
> users db is for answering the question: "who is the user and what
> roles do they have". The ACLs say which names and roles can read or
> admin a given database.
> 
> It's a fact of life that users can rsync db-files around. If the names
> / roles are in the users db, they get wrong when databases are moved
> to another host or renamed on the current host.

The last sentence I agree with. The same is true if you delete a DB and
recreate one with the same name.

However, database uuids were proposed recently. If the _users doc authorized
against uuids rather than database names, would that issue be solved?
(The ability to have a per-user _all_dbs view would be lost, if there wasn't
a fast way to map a uuid back to a database name, but we've already decided
we can live without that)

> 4 is fixed.

Thanks. It didn't even add any data privacy, since an _admin could always
add themselves as a _reader anyway.

> > (5) Non-admin readers can view the entire _readers, _admins and _security
> > resources.  I think this is quite a severe privacy concern, but it is easily
> > fixed.
> 
> They can also read the design document. I'm not sure why this is a
> privacy concern. A user may need to contact a db admin for help with
> something, it's handy to be able to get a list of them. And it only
> makes sense that you should be able to see the list of users who can
> also access the same db you can.
> 
> If there's consensus that this is indeed an issue, it's not a hard
> thing to change in the code.

I await what others say. However I would certainly *not* want the internal
E-mail addresses of my admins being available to the whole world.  And as an
end-user of a facebook-style application, I would not want my E-mail address
known to every other user anywhere on that database.

For comparison: if you're granted SELECT, INSERT, UPDATE and DELETE
privileges as an Oracle user, that does not mean you get to find out the
usernames of the admins, or indeed any other users with rights to the same
database.

And for comparison: if someone signs up for the dev@apache.org mailing list,
they do not get to see the E-mail addresses of all the other members of this
list, and nor should I.

(I think the latter comparison is fair; a couchapp BBS would be a very sweet
thing to have)

> > (9) The _users db itself is world-readable (showing not only who your users
> > are, but their password hashes). Highly undesirable.
> 
> I actually consider this a feature. We'd like to get some stronger
> password hashing (see the bcrypt threads) which should help with the
> password parts.

At the end of the day, bcrypt is still a hash of a password. Any password
hash is open to off-line brute-force attack.  You can tune the cost with
bcrypt, but dictionary attacks are still going to succeed for 90% of users. 
You may be running couchdb on a modest server but your attacker is thousands
of times more powerful than you, and can spend years doing it if they want.

Put it another way: if I suggested that people should start making
/etc/shadow world-readable, people would laugh.  If I suggested that they
also post it on their public webserver, I would be laughed out of town.

Blocking _users is probably good enough for now. I'd be more comfortable if
_readers didn't fail-open.  I'm also concerned that newcomers may not be
impressed to find couchdb so "insecure" in its default state.

However, if _users could be blocked, but there were a restricted API for
manipulating it (something like _update and _show handlers, allowing users
only to see and change their own records), that would be much better IMO.

Regards,

Brian.

Re: DB ACLs (was Re: 0.11 Release / Feature Freeze for 1.0)

Posted by Jan Lehnardt <ja...@apache.org>.
On 3 Feb 2010, at 09:21, Chris Anderson wrote:
>> I do see logic in keeping the admin/reader authorizations for a database
>> within the database itself. The problems are:
>> 
>> (1) _all_dbs currently shows everything - even those databases you don't
>> have access to.
>> 
>> I believe that in its current form, _all_dbs simply won't scale to millions
>> of databases on a box if you want to limit it to accessible dbs only.
>> 
> 
> This is an interesting one. _all_dbs won't scale indefinitely even
> before this patch, because it has no built-in pagination abilities.
> Enhancing this feature to look into each file and keep going to till
> it finds N that can be listed isn't hard to code. It will be a little
> more work to make _all_dbs respect startkey and endkey.
> 
> I'd be thrilled if someone would patch _all_dbs to act correctly.

I think I'm leaning towards making _all_dbs a admin-only resource.

Brian, thanks for taking the time to criticise Chris's patch and thanks
Chris for the thoughtful reply :)

Cheers
Jan
--



> 
>> (2) _readers is a single monolithic object. I believe that it won't scale to
>> millions of users having access to the same database.
> 
> It's not meant to support this use case. If you have millions of users
> with the same access rights, give them a common role and give that
> role access to the database.
> 
>> 
>> (3) _readers has no concurrency control. One admin making an ACL change in
>> futon (say) will silently overwrite changes made around the same time by
>> another admin. This will get worse the more frequently users are added and
>> removed.
> 
> _readers / _admins / _security are stored as a raw object without
> concurrency control, because keeping them as a document adds too much
> performance overhead on each request. Concurrency control is a
> tradeoff we make here.
> 
>> 
>> For me, those are serious problems. I sketched a design for an alternative
>> approach, using the _user db to hold the authorizations in terms of
>> database-specific roles.  Unfortunately I didn't have time to contribute an
>> implementation of this.  If there's a chance this alternative approach would
>> be used then I will try to steal the time from somewhere.  The ideas behind
>> it weren't explicitly rejected, but neither were they acknowledged as a good
>> approach.
> 
> The database-specfic roles and names don't belong in the users db. The
> users db is for answering the question: "who is the user and what
> roles do they have". The ACLs say which names and roles can read or
> admin a given database.
> 
> It's a fact of life that users can rsync db-files around. If the names
> / roles are in the users db, they get wrong when databases are moved
> to another host or renamed on the current host.
> 
> 
>> 
>> (4) An "admin" is not a "reader", and this is clearly intentional from
>> comments in the code. However, someone who has an "admin" role without
>> "reader" role is unable to perform ACL changes, which for me defeats the
>> whole purpose of the "admin" role.
>> 
>> 
>> So I'd propose that the relaxed approach is that a database "admin" should
>> inherit "reader" rights. Isn't that true for a server-level admin anyway?
> 
> I considered this, but dropped it because it seemed like it'd send the
> wrong message (if you put yourself as the only db-admin, this doesn't
> make the db private, to do that you need to edit the readers.)
> 
> Looking at your shell transcript makes me think it'd make life easier
> if we grant db-admins the reader privilege as well (so long as it
> remains clear that dbs with admins but no readers are public).
> 
> 4 is fixed.
> 
>> 
>> (5) Non-admin readers can view the entire _readers, _admins and _security
>> resources.  I think this is quite a severe privacy concern, but it is easily
>> fixed.
> 
> They can also read the design document. I'm not sure why this is a
> privacy concern. A user may need to contact a db admin for help with
> something, it's handy to be able to get a list of them. And it only
> makes sense that you should be able to see the list of users who can
> also access the same db you can.
> 
> If there's consensus that this is indeed an issue, it's not a hard
> thing to change in the code.
> 
>> 
>> (6) Databases are created world-readable by default, which means a race to
>> get the _readers set before someone else starts inserting documents. I think
>> a PUT /dbname option to set a non-empty readers list would be a good idea
>> (and a corresponding checkbox in futon)
> 
> Yes, I'd like to add some UI to the create-database form in Futon that
> immediately sets a reader == the current user. The UI would be a
> checkbox that says "make database private" or something. Making this
> synchronous with db-creation would be a neat trick, too.
> 
>> 
>> (7) Couchdb accepts nonsense _readers documents, e.g.
>> 
>> $ curl -X PUT -d '{"names":{"foo":"bar"},"roles":456}'
>> http://admin:admin@127.0.0.1:5984/briantest/_readers
>> {"ok":true}
>> 
>> The effect is to reset the _readers document to its permit-all default, thus
>> opening up the database to the world.
>> 
>> $ curl http://127.0.0.1:5984/briantest/_readers
>> {"names":[],"roles":[]}
> 
> My main concern in coding this was validating that malformed data
> can't be stored. I've fixed this to throw errors in the case of
> malformed data (instead of ignoring it).
> 
>> 
>> (8) Point (7) is arguably a simple bug which can be fixed, but I'd prefer
>> for couchdb to be fail-safe; that is, an empty ACL means nobody has access.
> 
> The goal with this patch was to add some easy ACL functionality
> without changing the default behavior. People who are happy to use
> Couch in admin-party mode shouldn't see any changes.
> 
>> 
>> One way to achieve this would be for two new roles, "_anon" and "_user",
>> granted to all unauthenticated and authenticated users respectively. Then a
>> fully public database would have roles:["_anon","_user"], and this would
>> be added to a new database unless you ask otherwise (see point (6)).
> 
> This ["_anon","_user"] business seems fine to me. It's a lot more
> complexity for basically the same capabilities (although I do like the
> ability to lock out anon users on a per-db basis, that's pretty cool.)
> 
> If anyone writes this, I'd be happy to see it. This won't be a small
> patch because it's got to introduce those roles as well as use them in
> the reader checks.
> 
>> 
>> (9) The _users db itself is world-readable (showing not only who your users
>> are, but their password hashes). Highly undesirable.
> 
> I actually consider this a feature. We'd like to get some stronger
> password hashing (see the bcrypt threads) which should help with the
> password parts.
> 
>> 
>> You can set a _readers ACL on it, but it has consequences:
>> * users can't sign up for new accounts
>> * users can't change their own passwords
>> This forces such things through a privileged external interface. Actually
>> I'm fine with that, because I want to validate signups anyway, but others
>> might not be.
>> 
>> (10) I don't think you can replicate _readers, _admins and _security, unless
>> (a) you are doing rsync filesystem-level replication, or (b) you explicitly
>> GET and PUT these resources from one DB to another. This is arguably a
>> feature not a bug.
>> 
>> (11) Trivial problem of the day: _security resources which are not objects
>> give an erlang-flavoured error.
>> 
>> $ curl -X PUT -d '["foo","bar"]' http://brianadmin:brianadmin@127.0.0.1:5984/briantest/_security
>> {"error":"unknown_error","reason":"function_clause"}
>> 
>> Sorry for the long post, and if I really am barking up the wrong tree here,
>> please tell me so.
> 
> Thanks for pointing out a lot of small bugs. The ones I think have merit are:
> 
> 1, 4, 6, 7, 8, 11
> 
> Of these I've fixed 4, 7 and 11 this morning.
> 
> If you can create Jira tickets for 1, 6, and 8 that'd be nice.
> 
> Chris
> 
>> 
>> Regards,
>> 
>> Brian.
>> 
> 
> 
> 
> -- 
> Chris Anderson
> http://jchrisa.net
> http://couch.io