Posted to dev@couchdb.apache.org by Adam Kocoloski <ko...@apache.org> on 2011/08/16 16:26:04 UTC

The replicator needs a superuser mode

One of the principal uses of the replicator is to "make this database look like that one".  We're unable to do that in the general case today because of the combination of validation functions and out-of-order document transfers.  It's entirely possible for a document to be saved in the source DB prior to the installation of a ddoc containing a validation function that would have rejected the document, for the replicator to install the ddoc in the target DB before replicating the other document, and for the other document to then be rejected by the target DB.

I propose we add a role which allows a user to bypass validation, or else extend that privilege to the _admin role.  We should still validate updates by default and add a way (a new qs param, for instance) to indicate that validation should be skipped for a particular update.  Thoughts?

Adam
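
The failure mode described above can be reproduced with a toy model. The following Python sketch is not CouchDB code: ToyDB, require_author and the "validate" field are hypothetical stand-ins for a database, a validate_doc_update function and a design document. It only shows that the same two writes succeed in source order and that one of them is rejected when the design document arrives first.

```python
from typing import Callable, Dict, Optional

Doc = dict
Validator = Callable[[Doc], None]


class Rejected(Exception):
    """Raised when a validation function refuses an update."""


def require_author(doc: Doc) -> None:
    # Hypothetical rule standing in for a validate_doc_update function.
    if "author" not in doc:
        raise Rejected(f"{doc['_id']}: missing author")


class ToyDB:
    """Toy stand-in for a database: documents plus at most one validator."""

    def __init__(self) -> None:
        self.docs: Dict[str, Doc] = {}
        self.validator: Optional[Validator] = None

    def save(self, doc: Doc) -> bool:
        # Saving a design document installs its validator (an assumption of
        # this toy model; the ddoc itself is not validated here).
        if doc["_id"].startswith("_design/"):
            self.docs[doc["_id"]] = doc
            self.validator = doc["validate"]
            return True
        if self.validator is not None:
            try:
                self.validator(doc)
            except Rejected:
                return False
        self.docs[doc["_id"]] = doc
        return True


old_doc = {"_id": "note-1", "text": "written before the rule existed"}
ddoc = {"_id": "_design/rules", "validate": require_author}

# Source order: the document was saved before the ddoc was installed.
source = ToyDB()
source_results = [source.save(d) for d in (old_doc, ddoc)]

# Out-of-order transfer: the ddoc reaches the target first.
target = ToyDB()
target_results = [target.save(d) for d in (ddoc, old_doc)]

print("source:", source_results, sorted(source.docs))
print("target:", target_results, sorted(target.docs))
```

After the run the target is missing note-1 even though the source holds it, which is the "does not look like that one" outcome the proposal is meant to avoid.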

Re: The replicator needs a superuser mode

Posted by Jason Smith <jh...@iriscouch.com>.

On Tue, Aug 16, 2011 at 10:24 PM, Jan Lehnardt <ja...@apache.org> wrote:
> This is only slightly related, but I'm dreaming of /db/_dump and /db/_restore endpoints

Jan, I also had that dream at CouchOne, but now I think it is a very bad idea.

A database is a URL. Every URL is different. Cloning URL_A to URL_B is
tempting, but fundamentally anti-CouchDB.

There is a reason the security object does not replicate. Every URL
(or origin) is a different security environment, and it is meaningless
or wrong to apply A's security object to B's database.

Validation functions decide what to allow based on userCtx and secObj.
Both of those change (generally) with the URL. Cloning one database to
another IMO spits in the face of the architecture and philosophy of
replication.
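
To make the point about userCtx and secObj concrete, here is a minimal Python sketch. It mirrors the four arguments a validation function receives, but the rule itself, the security-object contents and the return-a-boolean style are assumptions made for illustration (a real validation function is JavaScript and signals rejection by throwing).

```python
def validate_doc_update(new_doc, old_doc, user_ctx, sec_obj):
    """Toy rule: only users named as admins in the security object may write."""
    if "_admin" in user_ctx.get("roles", []):
        return True
    return user_ctx.get("name") in sec_obj.get("admins", {}).get("names", [])


doc = {"_id": "note-1", "text": "hello"}
alice = {"name": "alice", "roles": []}

# Hypothetical security objects for two different database URLs.
sec_obj_a = {"admins": {"names": ["alice"], "roles": []}}
sec_obj_b = {"admins": {"names": ["bob"], "roles": []}}

accepted_at_a = validate_doc_update(doc, None, alice, sec_obj_a)
accepted_at_b = validate_doc_update(doc, None, alice, sec_obj_b)

print("URL_A accepts:", accepted_at_a)
print("URL_B accepts:", accepted_at_b)
```

The same document from the same user is accepted under one security object and refused under the other, so a byte-for-byte copy of A can hold documents that B's own rules would never have admitted.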

IMHO, cloning a *database* is not desirable. Long-term, you really
want to replicate a database.

Cloning a *couch* (GET /_dump, PUT /_restore) would be awesome. That
is the right abstraction level. Among other reasons, it can include
the config. Maybe that is mission creep.

-- 
Iris Couch

Re: The replicator needs a superuser mode

Posted by Ryan Ramage <ry...@gmail.com>.

> This is only slightly related, but I'm dreaming of /db/_dump and /db/_restore endpoints (the names don't matter, could be one with GET / PUT) that just ship verbatim .couch files over HTTP. It would be for admins only, it would not be incremental (although we might be able to add that), and I haven't yet thought through all the concurrency and error case implications. The above solves more than the proposed problem, and in a very different way, but I thought I'd throw it in the mix.
>

+1 on /db/_dump and /db/_restore endpoints!! Very beneficial to us
little people trying to make installers like couchapp-takeout, and
could even be used from futon to create a database from a remote db. I
am anecdotally noticing that using replication to create a local
database from a remote one with lots of attachments takes a long time,
is prone to timeouts, and gets stuck (been working with jhs on this).
Dump/restore would also be much faster, since it eliminates the many small requests.

Re: The replicator needs a superuser mode

Posted by Adam Kocoloski <ko...@apache.org>.

Hmm, if we used a separate role we'd need a multi-step process to trigger the replication:

1) create the database
2) have an admin grant the _skip_validation role on that DB to the replicator's user_ctx
3) trigger the replication

Kind of annoying.  It would certainly be simpler to let _admins do this just by adding a skip_validation=true parameter to write requests.

Adam
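
The two options under discussion (a dedicated role versus letting _admin skip validation) differ only in who may use the flag. Below is a minimal Python sketch of that decision. The should_validate helper and the admin_may_skip switch are hypothetical and do not exist in CouchDB; the role and parameter names come from the message above. Falling back to normal validation when an unprivileged user asks to skip is also an assumption, since the thread does not settle that.

```python
SKIP_ROLE = "_skip_validation"  # role name as proposed in the thread
ADMIN_ROLE = "_admin"


def should_validate(user_roles, skip_requested, admin_may_skip=False):
    """Return True if validation functions must run for this write.

    user_roles     -- roles from the writer's user_ctx
    skip_requested -- True if the request carried skip_validation=true
    admin_may_skip -- hypothetical switch for the "_admin can skip" variant
    """
    if not skip_requested:
        return True  # validation stays the default
    if SKIP_ROLE in user_roles:
        return False
    if admin_may_skip and ADMIN_ROLE in user_roles:
        return False
    # Assumption: an unauthorized skip request is ignored, not rejected.
    return True


print(should_validate(["_skip_validation"], skip_requested=True))
print(should_validate(["_admin"], skip_requested=True))
print(should_validate(["_admin"], skip_requested=True, admin_may_skip=True))
print(should_validate(["_skip_validation"], skip_requested=False))
```

With the separate role, an admin has to grant _skip_validation before the second call would return False, which is the extra step listed above; with admin_may_skip enabled that step disappears.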

On Aug 16, 2011, at 2:21 PM, Robert Newson wrote:

> no objection to special role. As in my opening statement, would be
> concerned about adding it to _admin without devoting more thought to
> possible unintended consequences.
> 
> b.

Re: The replicator needs a superuser mode

Posted by Robert Newson <rn...@apache.org>.

no objection to special role. As in my opening statement, would be
concerned about adding it to _admin without devoting more thought to
possible unintended consequences.

b.

On 16 August 2011 19:13, Robert Dionne <di...@dionne-associates.com> wrote:
> No objection, just the question of why the need for a new role, why not use admin?

Re: The replicator needs a superuser mode

Posted by Robert Dionne <di...@dionne-associates.com>.

No objection, just the question of why the need for a new role, why not use admin?



On Aug 16, 2011, at 2:10 PM, Adam Kocoloski wrote:

> Wow, this thread got hijacked a bit :)  Anyone object to the special role that has the "skip validation" superpower?
> 
> Adam

Re: The replicator needs a superuser mode

Posted by Paul Davis <pa...@gmail.com>.

On Tue, Aug 16, 2011 at 1:10 PM, Adam Kocoloski <ko...@apache.org> wrote:
> Wow, this thread got hijacked a bit :)

You must be new here.

Re: The replicator needs a superuser mode

Posted by Adam Kocoloski <ko...@apache.org>.

Wow, this thread got hijacked a bit :)  Anyone object to the special role that has the "skip validation" superpower?

Adam

On Aug 16, 2011, at 1:51 PM, Jan Lehnardt wrote:

> Both rsync and scp won't allow me to do curl http://couch/db/_dump | curl http://couch/db/_restore.
> 
> I acknowledge that similar solutions exist, but using the http transport allows for more fun things down the road.
> 
> See what we are doing with _changes today where DbUpdateNotifications nearly do the same thing.
> 
> Cheers
> Jan
> --

Re: The replicator needs a superuser mode

Posted by Jan Lehnardt <ja...@apache.org>.

Both rsync and scp won't allow me to do curl http://couch/db/_dump | curl http://couch/db/_restore.

I acknowledge that similar solutions exist, but using the http transport allows for more fun things down the road.

See what we are doing with _changes today where DbUpdateNotifications nearly do the same thing.

Cheers
Jan
--
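
The curl pipe above amounts to streaming one HTTP response body into another request without buffering the whole file. Here is a rough Python sketch of that idea. The /db/_dump and /db/_restore endpoints are only proposed in this thread and do not exist, so pipe_dump_to_restore, its URLs and its choice of PUT are all assumptions; only iter_chunks is exercised here, against an in-memory stream.

```python
import io
import urllib.request
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield successive chunks from a binary stream until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk


def pipe_dump_to_restore(source_url: str, target_url: str) -> int:
    """Stream a hypothetical _dump response into a hypothetical _restore.

    Neither endpoint exists; this function is illustrative and is not
    called below. urllib sends an iterable body with chunked encoding.
    """
    with urllib.request.urlopen(source_url) as dump:
        request = urllib.request.Request(
            target_url, data=iter_chunks(dump), method="PUT"
        )
        with urllib.request.urlopen(request) as response:
            return response.status


# Exercise the chunking on an in-memory stand-in for a .couch file.
fake_dump = io.BytesIO(b"x" * 150_000)
chunks = list(iter_chunks(fake_dump))
print([len(c) for c in chunks])
```

Because the body is consumed chunk by chunk, memory use stays bounded by chunk_size regardless of database size, which is the property that makes the HTTP pipe attractive compared with downloading and re-uploading a whole file.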

On 16.08.2011, at 19:13, Nathan Vander Wilt <na...@calftrail.com> wrote:

> We've already got replication, _all_docs and some really robust on-disk consistency properties. For shuttling raw database files between servers, wouldn't rsync be more efficient (and fit better within existing sysadmin security/deployment structures)?
> -nvw

Re: The replicator needs a superuser mode

Posted by Nathan Vander Wilt <na...@calftrail.com>.

We've already got replication, _all_docs and some really robust on-disk consistency properties. For shuttling raw database files between servers, wouldn't rsync be more efficient (and fit better within existing sysadmin security/deployment structures)?
-nvw


On Aug 16, 2011, at 9:55 AM, Paul Davis wrote:
> Me and Adam were just mulling over a similar endpoint the other night
> that could be used to generate plain-text backups similar to what
> couchdb-dump and couchdb-load were doing. With the idea that there
> would be some special sauce to pipe from one _dump endpoint directly
> into a different _load handler. Obvious downfall was incremental-ness
> of this. Seems like it'd be doable, but I'm not entirely certain on
> the best method.
> 
> I was also considering this as our foolproof, 100% reliable method for
> migrating data between different CouchDB versions which we seem to
> screw up fairly regularly.
> 
> +1 on the idea. Not sure about raw couch files as it limits the wider
> usefulness (and we already have scp).

Re: The replicator needs a superuser mode

Posted by Paul Davis <pa...@gmail.com>.

Me and Adam were just mulling over a similar endpoint the other night
that could be used to generate plain-text backups similar to what
couchdb-dump and couchdb-load were doing. With the idea that there
would be some special sauce to pipe from one _dump endpoint directly
into a different _load handler. Obvious downfall was incremental-ness
of this. Seems like it'd be doable, but I'm not entirely certain on
the best method.

I was also considering this as our foolproof, 100% reliable method for
migrating data between different CouchDB versions which we seem to
screw up fairly regularly.

+1 on the idea. Not sure about raw couch files as it limits the wider
usefulness (and we already have scp).

On Tue, Aug 16, 2011 at 10:24 AM, Jan Lehnardt <ja...@apache.org> wrote:
> This is only slightly related, but I'm dreaming of /db/_dump and /db/_restore endpoints (the names don't matter, could be one with GET / PUT) that just ship verbatim .couch files over HTTP. It would be for admins only, it would not be incremental (although we might be able to add that), and I haven't yet thought through all the concurrency and error case implications. The above solves more than the proposed problem, and in a very different way, but I thought I'd throw it in the mix.
>
> Cheers
> Jan
> --

Re: The replicator needs a superuser mode

Posted by Jan Lehnardt <ja...@apache.org>.

This is only slightly related, but I'm dreaming of /db/_dump and /db/_restore endpoints (the names don't matter, could be one with GET / PUT) that just ship verbatim .couch files over HTTP. It would be for admins only, it would not be incremental (although we might be able to add that), and I haven't yet thought through all the concurrency and error case implications. The above solves more than the proposed problem, and in a very different way, but I thought I'd throw it in the mix.

Cheers
Jan
-- 

On Aug 16, 2011, at 5:08 PM, Robert Newson wrote:

> +1 on the intention but we'll need to be careful. The use case is
> specifically to allow verbatim migration of databases between servers.
> A separate role makes sense as I'm not sure of the consequences of
> explicitly granting this ability to the existing _admin role.
> 
> B.

Re: The replicator needs a superuser mode

Posted by Robert Newson <rn...@apache.org>.

+1 on the intention but we'll need to be careful. The use case is
specifically to allow verbatim migration of databases between servers.
A separate role makes sense as I'm not sure of the consequences of
explicitly granting this ability to the existing _admin role.

B.

On 16 August 2011 15:26, Adam Kocoloski <ko...@apache.org> wrote:
> One of the principal uses of the replicator is to "make this database look like that one".  We're unable to do that in the general case today because of the combination of validation functions and out-of-order document transfers.  It's entirely possible for a document to be saved in the source DB prior to the installation of a ddoc containing a validation function that would have rejected the document, for the replicator to install the ddoc in the target DB before replicating the other document, and for the other document to then be rejected by the target DB.
>
> I propose we add a role which allows a user to bypass validation, or else extend that privilege to the _admin role.  We should still validate updates by default and add a way (a new qs param, for instance) to indicate that validation should be skipped for a particular update.  Thoughts?
>
> Adam

Re: The replicator needs a superuser mode

Posted by Jason Smith <jh...@iriscouch.com>.

On Wed, Aug 17, 2011 at 9:49 AM, Adam Kocoloski <ko...@apache.org> wrote:
> On Aug 16, 2011, at 10:31 PM, Jason Smith wrote:
>
>> On Tue, Aug 16, 2011 at 9:26 PM, Adam Kocoloski <ko...@apache.org> wrote:
>>> One of the principal uses of the replicator is to "make this database look like that one".  We're unable to do that in the general case today because of the combination of validation functions and out-of-order document transfers.  It's entirely possible for a document to be saved in the source DB prior to the installation of a ddoc containing a validation function that would have rejected the document, for the replicator to install the ddoc in the target DB before replicating the other document, and for the other document to then be rejected by the target DB.
>>
>> Somebody asked about this on Stack Overflow. It was a very simple but
>> challenging question, but now I can't find it. Basically, he made your
>> point above.
>>
>> Aren't you identifying two problems, though?
>>
>> 1. Sometimes you need to ignore validation to just make a nice, clean copy.
>> 2. Replication batches (an optimization) are disobeying the change
>> sequence, which can screw up the replica.
>
> As far as I know the only reason one needs to ignore validation to make a nice clean copy is because the replicator does not guarantee the updates are applied on the target in the order they were received on the source.  It's all one issue to me.
>
>> I responded to #1 already.
>>
>> But my feeling about #2 is that the optimization goes too far.
>> replication batches should always have boundaries immediately before
>> and after design documents. In other words, batch all you want, but
>> design documents [1] must always be in a batch size of 1. That will
>> retain the semantics.
>>
>> [1] Actually, the only ddocs needing their own private batches are
>> those with a validate_doc_update field.
>
> My standard retort to transaction boundaries is that there is no global ordering of events in a distributed system.  A clustered CouchDB can try to build a vector clock out of the change sequences of the individual servers and stick to that merged sequence during replication, but even then the ddoc entry in the feed could be "concurrent" with several other updates.  I rather like that the replicator aggressively mixes up the ordering of updates because it prevents us from making choices in the single-server case that aren't sensible in a cluster.

That is interesting. So if it is crucial that an application enforce
transaction semantics, then that application can go ahead and
understand the distribution architecture, and it can confirm that a
ddoc is committed and distributed among all nodes, and then it can
make subsequent changes or replications.

Or, written as a dialogue:

Developer: My application knows or cares that Couch is distributed.
Developer: My application depends on a validation function applying universally.
Developer: But my application won't bother to confirm that it's been
fully pushed before I make changes or replications.
Adam: WTF?

Snark aside, it's an excellent point. Thanks.

-- 
Iris Couch

Re: The replicator needs a superuser mode

Posted by Adam Kocoloski <ko...@apache.org>.

On Aug 16, 2011, at 10:31 PM, Jason Smith wrote:

> On Tue, Aug 16, 2011 at 9:26 PM, Adam Kocoloski <ko...@apache.org> wrote:
>> One of the principal uses of the replicator is to "make this database look like that one".  We're unable to do that in the general case today because of the combination of validation functions and out-of-order document transfers.  It's entirely possible for a document to be saved in the source DB prior to the installation of a ddoc containing a validation function that would have rejected the document, for the replicator to install the ddoc in the target DB before replicating the other document, and for the other document to then be rejected by the target DB.
> 
> Somebody asked about this on Stack Overflow. It was a very simple but
> challenging question, but now I can't find it. Basically, he made your
> point above.
> 
> Aren't you identifying two problems, though?
> 
> 1. Sometimes you need to ignore validation to just make a nice, clean copy.
> 2. Replication batches (an optimization) are disobeying the change
> sequence, which can screw up the replica.

As far as I know the only reason one needs to ignore validation to make a nice clean copy is because the replicator does not guarantee the updates are applied on the target in the order they were received on the source.  It's all one issue to me.
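
The ordering problem described above can be reproduced with a toy model. This is only a sketch: the "database" is a plain Python list, `require_type` is a made-up validation rule, and nothing here calls CouchDB itself. It shows the same two updates being accepted in source order and partly rejected when the design document arrives first.

```python
def require_type(doc):
    """Hypothetical validation rule: every non-design doc needs a 'type' field."""
    if not doc["_id"].startswith("_design/") and "type" not in doc:
        raise ValueError("rejected: missing 'type' in " + doc["_id"])


def apply_updates(updates):
    """Apply updates in the given order; return (stored ids, rejected ids)."""
    stored, rejected = [], []
    validator = None
    for doc in updates:
        try:
            if validator is not None:
                validator(doc)
        except ValueError:
            rejected.append(doc["_id"])
            continue
        stored.append(doc["_id"])
        # Installing the ddoc activates its validation function for later writes.
        if doc["_id"].startswith("_design/"):
            validator = doc["validate"]
    return stored, rejected


legacy = {"_id": "legacy-doc"}  # written before the ddoc existed
ddoc = {"_id": "_design/app", "validate": require_type}

source_order = apply_updates([legacy, ddoc])    # order of writes on the source
shuffled_order = apply_updates([ddoc, legacy])  # an order the replicator may use
print(source_order)
print(shuffled_order)
```

In the shuffled run the legacy document never reaches the target, which is the failure the original proposal is trying to address.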

> I responded to #1 already.
> 
> But my feeling about #2 is that the optimization goes too far.
> replication batches should always have boundaries immediately before
> and after design documents. In other words, batch all you want, but
> design documents [1] must always be in a batch size of 1. That will
> retain the semantics.
> 
> [1] Actually, the only ddocs needing their own private batches are
> those with a validate_doc_update field.

My standard retort to transaction boundaries is that there is no global ordering of events in a distributed system.  A clustered CouchDB can try to build a vector clock out of the change sequences of the individual servers and stick to that merged sequence during replication, but even then the ddoc entry in the feed could be "concurrent" with several other updates.  I rather like that the replicator aggressively mixes up the ordering of updates because it prevents us from making choices in the single-server case that aren't sensible in a cluster.

By the way, I don't consider this line of discussion presumptuous in the least.  Cheers,

Adam

Re: The replicator needs a superuser mode

Posted by Jason Smith <jh...@iriscouch.com>.

On Tue, Aug 16, 2011 at 9:26 PM, Adam Kocoloski <ko...@apache.org> wrote:
> One of the principal uses of the replicator is to "make this database look like that one".  We're unable to do that in the general case today because of the combination of validation functions and out-of-order document transfers.  It's entirely possible for a document to be saved in the source DB prior to the installation of a ddoc containing a validation function that would have rejected the document, for the replicator to install the ddoc in the target DB before replicating the other document, and for the other document to then be rejected by the target DB.

Somebody asked about this on Stack Overflow. It was a very simple but
challenging question, but now I can't find it. Basically, he made your
point above.

Aren't you identifying two problems, though?

1. Sometimes you need to ignore validation to just make a nice, clean copy.
2. Replication batches (an optimization) are disobeying the change
sequence, which can screw up the replica.

I responded to #1 already.

But my feeling about #2 is that the optimization goes too far.
Replication batches should always have boundaries immediately before
and after design documents. In other words, batch all you want, but
design documents [1] must always be in a batch size of 1. That will
retain the semantics.

[1] Actually, the only ddocs needing their own private batches are
those with a validate_doc_update field.
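
A rough sketch of the batching rule suggested above, assuming the changes arrive as a list of documents in sequence order. The function names and the `max_batch` parameter are illustrative and are not part of the CouchDB replicator.

```python
def needs_own_batch(doc):
    """True for design docs that carry a validate_doc_update field."""
    return doc["_id"].startswith("_design/") and "validate_doc_update" in doc


def batch_changes(docs, max_batch=3):
    """Group docs in sequence order; validating ddocs get a batch of their own."""
    batches, current = [], []
    for doc in docs:
        if needs_own_batch(doc):
            # Close the batch before the ddoc, then emit the ddoc alone.
            if current:
                batches.append(current)
                current = []
            batches.append([doc])
        else:
            current.append(doc)
            if len(current) == max_batch:
                batches.append(current)
                current = []
    if current:
        batches.append(current)
    return batches


feed = [
    {"_id": "a"},
    {"_id": "b"},
    {"_id": "_design/app", "validate_doc_update": "function(newDoc) {}"},
    {"_id": "c"},
]
batch_ids = [[d["_id"] for d in batch] for batch in batch_changes(feed)]
print(batch_ids)
```

Documents inside a batch may still be written in any order, but no document can cross a boundary formed by a validating design document.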

-- 
Iris Couch

Re: The replicator needs a superuser mode

Posted by Adam Kocoloski <ko...@apache.org>.

On Aug 17, 2011, at 11:46 AM, Jean-Pierre Fiset wrote:

> I think that the operations of replication and backing up are quite different. Although some are using the replication features for backing up, I tend to think of replication as an operation taking place between two nodes that do not necessarily trust one another.

That's one possible use case for replication, but hardly the only one.  Anyway, if you don't trust the replication then I certainly hope the replication doesn't use credentials that map to _admin powers on your database.  If the replication doesn't have _admin powers it cannot bypass validation.

> If what you are proposing is a special privilege given to the admin party, then I do not have much of an issue with this, since administrators already have intimate access to the server. However, the concept of creating a new "replicator" role, which would supersede the validation functions is another thing.

Yes, I probably should have picked one approach and stuck with it.  Either way, my intent was that a replicator could bypass validation only if an admin had given it credentials that mapped to a powerful role (possibly _admin), *and* if the admin had explicitly asked for the replicator to bypass validation.
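
The two conditions in that paragraph (a powerful role and an explicit opt-in) can be written down as a small decision function. This is a sketch of the proposal only: `may_skip_validation`, `write_doc`, and the `skip_requested` flag are hypothetical names, and no such option is claimed to exist in CouchDB.

```python
ADMIN_ROLE = "_admin"


def may_skip_validation(roles, skip_requested, bypass_roles=(ADMIN_ROLE,)):
    """Skip only when the caller holds a bypass role AND asked for it explicitly."""
    return bool(skip_requested) and any(role in bypass_roles for role in roles)


def write_doc(doc, roles, validators, skip_requested=False):
    """Return True if the doc would be stored; validators raise ValueError to reject."""
    if not may_skip_validation(roles, skip_requested):
        for validate in validators:
            validate(doc)
    return True


def reject_everything(doc):
    """Stand-in for a strict validation function."""
    raise ValueError("forbidden")


print(may_skip_validation(["_admin"], True))    # role and opt-in
print(may_skip_validation(["_admin"], False))   # role without opt-in
print(may_skip_validation(["reader"], True))    # opt-in without role
```

Validation stays the default path: an admin who does not ask for the bypass is validated like everyone else.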

> In applications that must ensure that some document types have a given structure, opening the door to a user (and here I assume a user that attempts a replication from a different node, not a local administrator performing a back up) to work around the validation function is probably a bad idea.

That's not going to happen, unless you granted the user this really powerful role.  Don't do that.

> If the validation function could not be counted on, it would really affect the way an application must be written.

Understood, I'm certainly not asking for the replicator to bypass validations in general.  Cheers,

Adam

Re: The replicator needs a superuser mode

Posted by Jean-Pierre Fiset <jp...@fiset.ca>.

I think that the operations of replication and backing up are quite different. Although some are
using the replication features for backing up, I tend to think of replication as an operation
taking place between two nodes that do not necessarily trust one another.

If what you are proposing is a special privilege given to the admin party, then I do not have
much of an issue with this, since administrators already have intimate access to the server.
However, the concept of creating a new "replicator" role, which would supersede the validation
functions is another thing.

In applications that must ensure that some document types have a given structure, opening the
door to a user (and here I assume a user that attempts a replication from a different node, not
a local administrator performing a back up) to work around the validation function is probably a
bad idea. If the validation function could not be counted on, it would really affect the way an
application must be written.

JP

On 11-08-16 02:40 PM, Adam Kocoloski wrote:
> Hi Jean-Pierre, I'm not quite sure I follow that line of reasoning.  A user with _admin privileges on the database can easily remove any validation functions prior to writing today.  In my proposal skipping validation would require _admin rights and an explicit opt-in on a per-request basis.  What are you trying to guard against with those validation functions?  Best,
> 
> Adam
> 
> On Aug 16, 2011, at 2:29 PM, Jean-Pierre Fiset wrote:
> 
>> I understand the issue brought by Adam since in our CouchDb application, there is a need to have a replicator role and the validation functions skip most of the tests if the role is set for the current user.
>>
>> On the other hand, at the current time, I am not in favour of making super users for the sake of replication. Although it might solve the particular problem stated, it removes the ability for a design document to enforce some "invariant" properties of a database.
>>
>> Since there is already a way to allow a "replicator" to perform any changes (role + proper validation function), I do not see the need for this change. Since the super replicator user removes the ability that a database has to protect the consistency of its data, and that there does not seem to be a work-around, I would rather not see this change pushed to CouchDb.
>>
>> JP
>>
>> On 11-08-16 10:26 AM, Adam Kocoloski wrote:
>>> One of the principal uses of the replicator is to "make this database look like that one".  We're unable to do that in the general case today because of the combination of validation functions and out-of-order document transfers.  It's entirely possible for a document to be saved in the source DB prior to the installation of a ddoc containing a validation function that would have rejected the document, for the replicator to install the ddoc in the target DB before replicating the other document, and for the other document to then be rejected by the target DB.
>>>
>>> I propose we add a role which allows a user to bypass validation, or else extend that privilege to the _admin role.  We should still validate updates by default and add a way (a new qs param, for instance) to indicate that validation should be skipped for a particular update.  Thoughts?
>>>
>>> Adam
>>
>

Re: The replicator needs a superuser mode

Posted by Dale Harvey <da...@arandomurl.com>.

On 17 August 2011 02:47, Adam Kocoloski <ko...@apache.org> wrote:

> On Aug 16, 2011, at 8:20 PM, Randall Leeds wrote:
>
> > On Tue, Aug 16, 2011 at 17:03, Adam Kocoloski <ko...@apache.org>
> wrote:
> >
> >> On Aug 16, 2011, at 5:46 PM, Randall Leeds wrote:
> >>
> >>> -1 on _skip_validation and new role
> >>>
> >>> One can always write a validation document that considers the role, no?
> >> Why
> >>> can't users who need this functionality craft a validation function for
> >> this
> >>> purpose? This sounds like a blog post and not a database feature.
> >>
> >> Blech, really?
> >>
> >> Q: What request do I issue to guarantee all my documents are stored in
> this
> >> other database?
> >>
> >> A: Unpossible.
> >>
> >> Practically speaking we need it at Cloudant because we use replication
> to
> >> move users' databases between clusters.  If it's not seen as generally
> >> useful that's ok, just surprising.  Best,
> >>
> >
> > I understand the motivation a little better now. I'm not sure it's
> generally
> > useful. I think _dump/_load might be, but I'd rather see users craft
> around
> > validation as part of their replication strategy rather than increase the
> > query option population.
> >
> > I'm not sure I'm against admin user context bypassing validation docs,
> > though.
>
> That's interesting.  It sounds like you're motivated to minimize the
> surface area of the API.  I can respect that.  I'm not sure I like _admins
> automatically bypassing validation, though, because we already require
> _admin to update _design docs, so it's not as if we can make the use of
> _admin particularly rare.  Will think on it.  Best,
>
> Adam


Just to point out a very useful use case for a /_dump and /_load endpoint: on
mobile we need to ship preloaded data / applications. I originally curl'd
design docs and PUT them on startup, but the resulting files are large and
startup time is slow; replicating isn't an option.

Now we use .couch files to preload data; however, all my stuff is on a hosted
server where I don't have access to scp (I can just copy them down to servers
where I can access .couch files, but I'm speaking on behalf of new users /
making things as easy as possible).

Re: The replicator needs a superuser mode

Posted by Adam Kocoloski <ko...@apache.org>.

On Aug 16, 2011, at 8:20 PM, Randall Leeds wrote:

> On Tue, Aug 16, 2011 at 17:03, Adam Kocoloski <ko...@apache.org> wrote:
> 
>> On Aug 16, 2011, at 5:46 PM, Randall Leeds wrote:
>> 
>>> -1 on _skip_validation and new role
>>> 
>>> One can always write a validation document that considers the role, no?
>> Why
>>> can't users who need this functionality craft a validation function for
>> this
>>> purpose? This sounds like a blog post and not a database feature.
>> 
>> Blech, really?
>> 
>> Q: What request do I issue to guarantee all my documents are stored in this
>> other database?
>> 
>> A: Unpossible.
>> 
>> Practically speaking we need it at Cloudant because we use replication to
>> move users' databases between clusters.  If it's not seen as generally
>> useful that's ok, just surprising.  Best,
>> 
> 
> I understand the motivation a little better now. I'm not sure it's generally
> useful. I think _dump/_load might be, but I'd rather see users craft around
> validation as part of their replication strategy rather than increase the
> query option population.
> 
> I'm not sure I'm against admin user context bypassing validation docs,
> though.

That's interesting.  It sounds like you're motivated to minimize the surface area of the API.  I can respect that.  I'm not sure I like _admins automatically bypassing validation, though, because we already require _admin to update _design docs, so it's not as if we can make the use of _admin particularly rare.  Will think on it.  Best,

Adam

Re: The replicator needs a superuser mode

Posted by Randall Leeds <ra...@gmail.com>.

On Tue, Aug 16, 2011 at 17:03, Adam Kocoloski <ko...@apache.org> wrote:

> On Aug 16, 2011, at 5:46 PM, Randall Leeds wrote:
>
> > -1 on _skip_validation and new role
> >
> > One can always write a validation document that considers the role, no?
> Why
> > can't users who need this functionality craft a validation function for
> this
> > purpose? This sounds like a blog post and not a database feature.
>
> Blech, really?
>
> Q: What request do I issue to guarantee all my documents are stored in this
> other database?
>
> A: Unpossible.
>
> Practically speaking we need it at Cloudant because we use replication to
> move users' databases between clusters.  If it's not seen as generally
> useful that's ok, just surprising.  Best,
>

I understand the motivation a little better now. I'm not sure it's generally
useful. I think _dump/_load might be, but I'd rather see users craft around
validation as part of their replication strategy rather than increase the
query option population.

I'm not sure I'm against admin user context bypassing validation docs,
though.
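
For reference, the "craft around validation" approach could look roughly like the following. Real validation functions live in design documents and are written in JavaScript; this Python version only models the logic, and the role name "replicator" and the "type" rule are assumptions made for illustration.

```python
class Forbidden(Exception):
    """Raised when a write is rejected by the validation function."""


def validate_doc_update(new_doc, old_doc, user_ctx):
    """Model of a role-aware validation function."""
    roles = user_ctx.get("roles", [])
    if "replicator" in roles:  # hypothetical role name: trusted, skip all checks
        return
    if "type" not in new_doc:
        raise Forbidden("documents must have a type")


def accepts(doc, roles):
    """Helper for the demo: True if the write passes validation."""
    try:
        validate_doc_update(doc, None, {"name": "demo", "roles": roles})
        return True
    except Forbidden:
        return False


print(accepts({"_id": "a"}, ["replicator"]))
print(accepts({"_id": "a"}, []))
print(accepts({"_id": "a", "type": "note"}, []))
```

The limitation raised elsewhere in the thread still applies: this only works when every validation function in the database has been written with the role in mind.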

Re: The replicator needs a superuser mode

Posted by Jason Smith <jh...@iriscouch.com>.

On Wed, Aug 17, 2011 at 9:11 PM, Noah Slater <ns...@apache.org> wrote:
>
> On 17 Aug 2011, at 11:06, Benoit Chesneau wrote:
>
>> Philosophy apart, dump and restore could be indeed useful to bootstrap
>> db, make plain backup/restore strategies, exchange dbs over a disk/mem
>> card without any couch installed etc.
>
> Yep, but in my mind this should live outside CouchDB's HTTP API. A dump and restore tool that lived on the command line, like the Subversion hotcopy stuff is the first thing that springs to mind. Or PostgreSQL's pgdump tool, or whatever. But as far as I understand the current file format, you should be able to just rsync the .couch files while the database is running.

A small note, rsync copies the _security object and also _local docs.
The latter are AFAIK only used by the replicator, and if you rsync to
a different URL, those docs are pretty inert.

It's not clear to me whether _security should travel with a database
dump. It seems prudent to want to back that up. But if I restore to a
different couch, it is imperative that I remember to correct the
_security. The new couch (generally) has totally different user
accounts and roles defined.

Yet despite my initial disagreement with _dump, Adam has reminded or
persuaded me (FWIW) that Couch really needs a better mechanism to
clone or copy data efficiently.

-- 
Iris Couch

Re: The replicator needs a superuser mode

Posted by Chris Anderson <jc...@apache.org>.

I see nothing wrong with an admin capability that allows you to break
the rules if it is more convenient for common operations. This is not
really a replicator thing, what we're talking about is giving the
_admin power to invoke skip_validations=true on document updates. This
can be useful for bulk loading known valid data, for instance. Or if
you are an admin and you aren't concerned with validity, just want to
give the user a faithful copy of the wacky invalid database.

So I think we can add that, but it's not really a replicator feature,
let's not get confused.

As far as the _dump and _load we talk about here, it should be
possible to take a _changes feed with include docs and multipart
attachments, and pipe it through gzip, to get the JSON _dump artifact.
I think this is a really good idea.
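
A minimal sketch of that dump artifact, assuming the change rows have already been fetched (for example from a _changes request with include_docs) and are available as Python dicts. Attachments are not handled here, and `dump_changes` / `load_changes` are illustrative names rather than an existing tool.

```python
import gzip
import io
import json


def dump_changes(rows, fileobj):
    """Write each change row as one JSON line into a gzip stream."""
    with gzip.GzipFile(fileobj=fileobj, mode="wb") as gz:
        for row in rows:
            gz.write(json.dumps(row, sort_keys=True).encode("utf-8") + b"\n")


def load_changes(fileobj):
    """Read the rows back from a gzip stream written by dump_changes."""
    with gzip.GzipFile(fileobj=fileobj, mode="rb") as gz:
        return [json.loads(line) for line in gz.read().decode("utf-8").splitlines()]


rows = [
    {"seq": 1, "id": "a", "changes": [{"rev": "1-abc"}],
     "doc": {"_id": "a", "_rev": "1-abc", "type": "note"}},
    {"seq": 2, "id": "_design/app", "changes": [{"rev": "1-def"}],
     "doc": {"_id": "_design/app", "_rev": "1-def"}},
]

buffer = io.BytesIO()
dump_changes(rows, buffer)
buffer.seek(0)
restored = load_changes(buffer)
print(restored == rows)
```

One JSON row per line keeps the dump streamable, so a restore step could process it without loading the whole file into memory.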

Another even simpler option that I think is kinda fun, is the ability
to replicate by copying the .couch file to the target. This would only
work if the target didn't exist yet. So you'd copy the (hopefully
compacted) .couch file to the new host, and it would put it into
place, and trigger continuous replication to catch up from there.

For quick launch scenarios (wanna spin up a worker on some box, and
the way to do that is to give it a database over there? you could push
the .couch file, with the worker's views and initial data in it, and
then use traditional replication to merge in the data pertaining to
the worker's particular job.)

So I think there are a lot of legitimate uses for non-Couch API ways
of getting at the data. If there is strong demand from users for one
or another option, we should consider including it.

Chris


-- 
Chris Anderson
http://jchrisa.net
http://couchbase.com

Re: The replicator needs a superuser mode

Posted by Amos Hayes <ah...@gcrc.carleton.ca>.

We've strayed onto the topic of backup/restore/import/export and someone mentioned pg_dump, so I'll toss this out there.

As a long time user of PostgreSQL and their import/export tools, I'd definitely suggest having a look at what options have evolved in their tool to get a feel for what people might want from a couch tool in the long run. Dumping users, schema, data, privileges, blobs, etc. or disabling triggers, etc. are all options specified at the command line. Many of these options could be mapped to couch concepts such as design docs, security objects, attachments, validation processing, etc.

pg_dump is also careful to dump things in the order required for proper import. For couch, this might mean dumping out in such a way that future imports would see data loaded before design documents, etc.
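
As a sketch of what such an ordering might mean for couch, the function below sorts a dump so that regular documents come before design documents. It assumes the dump is a list of dicts with an `_id` field; `import_order` is a made-up helper, not part of any existing tool.

```python
def import_order(docs):
    """Stable sort: regular documents first, design documents last."""
    return sorted(docs, key=lambda doc: doc["_id"].startswith("_design/"))


dump = [
    {"_id": "_design/app"},
    {"_id": "b"},
    {"_id": "a"},
]
ordered_ids = [doc["_id"] for doc in import_order(dump)]
print(ordered_ids)
```

Because Python's sort is stable, documents keep their original relative order within each group.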

Whether couchdb winds up with a command line tool or an authenticated URL with parameters and possibly automation is an interesting question too. The former could well be built on the latter.

FYI. Below is the pg_dump command line help from PostgreSQL 8.4 for reference. I hope it's helpful.

--
Amos

-----------------------------------------------------------------------------------------------------------
pg_dump dumps a database as a text file or to other formats.

Usage:
  pg_dump [OPTION]... [DBNAME]

General options:
  -f, --file=FILENAME      output file name
  -F, --format=c|t|p       output file format (custom, tar, plain text)
  -i, --ignore-version     proceed even when server version mismatches
                           pg_dump version
  -v, --verbose            verbose mode
  -Z, --compress=0-9       compression level for compressed formats
  --help                   show this help, then exit
  --version                output version information, then exit

Options controlling the output content:
  -a, --data-only             dump only the data, not the schema
  -b, --blobs                 include large objects in dump
  -c, --clean                 clean (drop) schema prior to create
  -C, --create                include commands to create database in dump
  -d, --inserts               dump data as INSERT commands, rather than COPY
  -D, --column-inserts        dump data as INSERT commands with column names
  -E, --encoding=ENCODING     dump the data in encoding ENCODING
  -n, --schema=SCHEMA         dump the named schema(s) only
  -N, --exclude-schema=SCHEMA do NOT dump the named schema(s)
  -o, --oids                  include OIDs in dump
  -O, --no-owner              skip restoration of object ownership
                              in plain text format
  -s, --schema-only           dump only the schema, no data
  -S, --superuser=NAME        specify the superuser user name to use in
                              plain text format
  -t, --table=TABLE           dump the named table(s) only
  -T, --exclude-table=TABLE   do NOT dump the named table(s)
  -x, --no-privileges         do not dump privileges (grant/revoke)
  --disable-dollar-quoting    disable dollar quoting, use SQL standard quoting
  --disable-triggers          disable triggers during data-only restore
  --use-set-session-authorization
                              use SESSION AUTHORIZATION commands instead of
                              ALTER OWNER commands to set ownership

Connection options:
  -h, --host=HOSTNAME      database server host or socket directory
  -p, --port=PORT          database server port number
  -U, --username=NAME      connect as specified database user
  -W, --password           force password prompt (should happen automatically)

If no database name is supplied, then the PGDATABASE environment
variable value is used.

Report bugs to <pg...@postgresql.org>.


-----------------------------------------------------------------------------------------------------------





On 2011-08-17, at 10:11 AM, Noah Slater wrote:

> 
> On 17 Aug 2011, at 11:06, Benoit Chesneau wrote:
> 
>> Philosophy apart, dump and restore could be indeed useful to bootstrap
>> db, make plain backup/restore strategies, exchange dbs over a disk/mem
>> card without any couch installed etc.
> 
> Yep, but in my mind this should live outside CouchDB's HTTP API. A dump and restore tool that lived on the command line, like the Subversion hotcopy stuff is the first thing that springs to mind. Or PostgreSQL's pgdump tool, or whatever. But as far as I understand the current file format, you should be able to just rsync the .couch files while the database is running.
>

Re: The replicator needs a superuser mode

Posted by Noah Slater <ns...@apache.org>.

On 17 Aug 2011, at 11:06, Benoit Chesneau wrote:

> Philosophy apart, dump and restore could be indeed useful to bootstrap
> db, make plain backup/restore strategies, exchange dbs over a disk/mem
> card without any couch installed etc.

Yep, but in my mind this should live outside CouchDB's HTTP API. A dump and restore tool that lived on the command line, like the Subversion hotcopy stuff is the first thing that springs to mind. Or PostgreSQL's pg_dump tool, or whatever. But as far as I understand the current file format, you should be able to just rsync the .couch files while the database is running.

Re: The replicator needs a superuser mode

Posted by Benoit Chesneau <bc...@gmail.com>.

On Wed, Aug 17, 2011 at 5:37 AM, Jason Smith <jh...@iriscouch.com> wrote:
> tl;dr response here, philosophical musings below.
>
> 1. The requirements are real, it's reasonable to want to copy from A to B
> 2. Replication is a whole worldview, adding ?force=true breaks that worldview
> 3. Dump and restore sounds more appropriate
>
> On Wed, Aug 17, 2011 at 9:34 AM, Adam Kocoloski <ko...@apache.org> wrote:
>>> But to "guarantee all my documents are stored in this other database"
>>> is actually incoherent. It is IMHO anti-CouchDB.
>>
>> Hi Jason, we're going to have to disagree on this one.  Replication is really flexible and can do lots of things that database replication has not historically been able to do, but I think it's a sad state of affairs that it's not possible to use replication to create a replica of an arbitrary database.
>
> True. I agree with the requirements, but the solution raises a red flag.
>
> My understanding of couch:
>
> There is no such thing as a database (or data set) clone. There is no
> such thing as a database copy. There is no such thing as two databases
> with the same document.

At some point there can be cloned or frozen dbs, so sure, they can exist.
There are some use cases for that.

> It's like Pauli's exclusion principle. Sure,
> maybe the doc and rev history are the same, but the _security object,
> the authentication environment, and the URI are different. That
> (generally) affects how applications and validation works.

But that can happen differently.

>
> Put another way, this idea is a leaky abstraction. I much prefer Jan's
> _dump and _restore idea. It has some difficulties, but it is *not*
> replication. It's something totally different. In the universe of a
> database, replication always follows the rules. In the universe of a
> Couch, sure, sometimes you need to clone data around. There's an
> appropriate action for each abstraction layer.
>
> The nice thing about _dump and _restore, and also rsync, is that you
> make full, opaque clones (not replicas!). You can't merge or splice
> data sets. Once you are talking about merging data, or pulling out a
> subset, now you are in database land, not couch land, and you have to
> follow the rules of replication.
>

Philosophy apart, dump and restore could be indeed useful to bootstrap
db, make plain backup/restore strategies, exchange dbs over a disk/mem
card without any couch installed etc.

I was playing with this idea the other day since it will be useful in
the refuge project (for purposes other than backup).

Features I drafted:
- ability to have a compacted dump
- dump can be done from a sequence and restored from
- dump can be merged in another db

Questions I still had to solve:
- What do we do with design documents, and especially validation
functions? Do we bypass them during the restore?
- What about the security object? In some use cases it may be interesting
to lock the database to same kind of readers & .
- Would it be useful to encrypt the dump from couch? Or leave this to
an external program?

-benoît

Re: The replicator needs a superuser mode

Posted by Adam Kocoloski <ko...@apache.org>.

On Aug 16, 2011, at 11:37 PM, Jason Smith wrote:

> 2. Replication is a whole worldview, adding ?force=true breaks that worldview

Replication is not a worldview, it's a mechanism by which documents are transferred between databases.  I think that's the crux of our disagreement.  Cheers,

Adam

Re: The replicator needs a superuser mode

Posted by Randall Leeds <ra...@gmail.com>.

On Tue, Aug 16, 2011 at 20:37, Jason Smith <jh...@iriscouch.com> wrote:

> The nice thing about _dump and _restore, and also rsync, is that you
> make full, opaque clones (not replicas!). You can't merge or splice
> data sets. Once you are talking about merging data, or pulling out a
> subset, now you are in database land, not couch land, and you have to
> follow the rules of replication.
>

Yeah, this is what I'm thinking, too. Except I'd reverse couch and database
:)

Re: The replicator needs a superuser mode

Posted by Jason Smith <jh...@iriscouch.com>.

tl;dr response here, philosophical musings below.

1. The requirements are real, it's reasonable to want to copy from A to B
2. Replication is a whole worldview, adding ?force=true breaks that worldview
3. Dump and restore sounds more appropriate

On Wed, Aug 17, 2011 at 9:34 AM, Adam Kocoloski <ko...@apache.org> wrote:
>> But to "guarantee all my documents are stored in this other database"
>> is actually incoherent. It is IMHO anti-CouchDB.
>
> Hi Jason, we're going to have to disagree on this one.  Replication is really flexible and can do lots of things that database replication has not historically been able to do, but I think it's a sad state of affairs that it's not possible to use replication to create a replica of an arbitrary database.

True. I agree with the requirements, but the solution raises a red flag.

My understanding of couch:

There is no such thing as a database (or data set) clone. There is no
such thing as a database copy. There is no such thing as two databases
with the same document. It's like Pauli's exclusion principle. Sure,
maybe the doc and rev history are the same, but the _security object,
the authentication environment, and the URI are different. That
(generally) affects how applications and validation works.

Put another way, this idea is a leaky abstraction. I much prefer Jan's
_dump and _restore idea. It has some difficulties, but it is *not*
replication. It's something totally different. In the universe of a
database, replication always follows the rules. In the universe of a
Couch, sure, sometimes you need to clone data around. There's an
appropriate action for each abstraction layer.

The nice thing about _dump and _restore, and also rsync, is that you
make full, opaque clones (not replicas!). You can't merge or splice
data sets. Once you are talking about merging data, or pulling out a
subset, now you are in database land, not couch land, and you have to
follow the rules of replication.

-- 
Iris Couch

Re: The replicator needs a superuser mode

Posted by Adam Kocoloski <ko...@apache.org>.

On Aug 16, 2011, at 10:23 PM, Jason Smith wrote:

> On Wed, Aug 17, 2011 at 7:03 AM, Adam Kocoloski <ko...@apache.org> wrote:
>> On Aug 16, 2011, at 5:46 PM, Randall Leeds wrote:
>> 
>>> -1 on _skip_validation and new role
>>> 
>>> One can always write a validation document that considers the role, no? Why
>>> can't users who need this functionality craft a validation function for this
>>> purpose? This sounds like a blog post and not a database feature.
>> 
>> Blech, really?
>> 
>> Q: What request do I issue to guarantee all my documents are stored in this other database?
>> 
>> A: Unpossible.
>> 
>> Practically speaking we need it at Cloudant because we use replication to move users' databases between clusters.  If it's not seen as generally useful that's ok, just surprising.  Best,
> 
> Adam, I'm conflicted. It feels presumptuous to disagree with you and
> the developers, which I've done a lot recently.
> 
> Also, I too struggle with migrating data, verbatim, between servers
> (between couches, and also between Linux boxes).
> 
> But to "guarantee all my documents are stored in this other database"
> is actually incoherent. It is IMHO anti-CouchDB.

Hi Jason, we're going to have to disagree on this one.  Replication is really flexible and can do lots of things that database replication has not historically been able to do, but I think it's a sad state of affairs that it's not possible to use replication to create a replica of an arbitrary database.

> Validation functions, user accounts (which change from couch to
> couch), and security objects (which also change from db to db, and
> couch to couch) all come together to decide whether a change is
> approved (valid). That is very powerful, and very fundamental.
> Providing this "guarantee" betrays the promise that Couch makes to
> developers.

No, it doesn't.  The "guarantee" presumes you have _admin access to the target database.  Developers shouldn't give that out, just like they shouldn't give out root access to the server itself.

> People are using validation functions for government compliance, to
> meet regulatory requirements (SOX, HIPAA). IIRC, you are proposing a
> query parameter for Couch to disregard those instructions.

Only if you have _admin access to the database, in which case you can already bypass validation or do whatever else you want to the data in that database if you're so inclined.

> Validation functions confirm not only authorization, but also
> well-formedness of the documents. So, again, in the real world, where
> many people use _admin accounts, adding a ?force=true parameter sounds
> dangerous.

Well, yes, it would be dangerous to use on every request.

> Do you worry whether, in the wild, people will use it more and more,
> like logging in to your workstation as root/Administrator? It
> eliminates daily annoyances but it is actually very risky behavior.

Meh.  If they choose to bypass their own validation functions that's their concern.  I don't lose sleep over it.

> Finally, yes, an admin can ultimately circumvent validation functions.
> But to me, that is the checks and balances of real life. If you forget
> your BIOS password, you can physically open the box and move a jumper.
> 
> I do agree about the need to move opaque data around. I disagree that
> a query parameter should allow it. I feel the hosting provider pain.
> The customer creates _design/angry with validate_doc_update:
> 
>    function(newDoc, oldDoc, userCtx, secObj) {
>        throw {forbidden: "I am _design/angry and I hate all documents!"};
>    }
> 
> And now I am responsible for replicating their data, unmolested, all
> over the place.
> 
> -- 
> Iris Couch

Re: The replicator needs a superuser mode

Posted by Jason Smith <jh...@iriscouch.com>.

On Wed, Aug 17, 2011 at 7:03 AM, Adam Kocoloski <ko...@apache.org> wrote:
> On Aug 16, 2011, at 5:46 PM, Randall Leeds wrote:
>
>> -1 on _skip_validation and new role
>>
>> One can always write a validation document that considers the role, no? Why
>> can't users who need this functionality craft a validation function for this
>> purpose? This sounds like a blog post and not a database feature.
>
> Blech, really?
>
> Q: What request do I issue to guarantee all my documents are stored in this other database?
>
> A: Unpossible.
>
> Practically speaking we need it at Cloudant because we use replication to move users' databases between clusters.  If it's not seen as generally useful that's ok, just surprising.  Best,

Adam, I'm conflicted. It feels presumptuous to disagree with you and
the developers, which I've done a lot recently.

Also, I too struggle with migrating data, verbatim, between servers
(between couches, and also between Linux boxes).

But to "guarantee all my documents are stored in this other database"
is actually incoherent. It is IMHO anti-CouchDB.

Validation functions, user accounts (which change from couch to
couch), and security objects (which also change from db to db, and
couch to couch) all come together to decide whether a change is
approved (valid). That is very powerful, and very fundamental.
Providing this "guarantee" betrays the promise that Couch makes to
developers.

People are using validation functions for government compliance, to
meet regulatory requirements (SOX, HIPAA). IIRC, you are proposing a
query parameter for Couch to disregard those instructions.

Validation functions confirm not only authorization, but also
well-formedness of the documents. So, again, in the real world, where
many people use _admin accounts, adding a ?force=true parameter sounds
dangerous.

Do you worry whether, in the wild, people will use it more and more,
like logging in to your workstation as root/Administrator? It
eliminates daily annoyances but it is actually very risky behavior.

Finally, yes, an admin can ultimately circumvent validation functions.
But to me, that is the checks and balances of real life. If you forget
your BIOS password, you can physically open the box and move a jumper.

I do agree about the need to move opaque data around. I disagree that
a query parameter should allow it. I feel the hosting provider pain.
The customer creates _design/angry with validate_doc_update:

    function(newDoc, oldDoc, userCtx, secObj) {
        throw {forbidden: "I am _design/angry and I hate all documents!"};
    }

And now I am responsible for replicating their data, unmolested, all
over the place.

-- 
Iris Couch

Re: The replicator needs a superuser mode

Posted by Adam Kocoloski <ko...@apache.org>.

On Aug 16, 2011, at 5:46 PM, Randall Leeds wrote:

> -1 on _skip_validation and new role
> 
> One can always write a validation document that considers the role, no? Why
> can't users who need this functionality craft a validation function for this
> purpose? This sounds like a blog post and not a database feature.

Blech, really?

Q: What request do I issue to guarantee all my documents are stored in this other database?

A: Unpossible.

Practically speaking we need it at Cloudant because we use replication to move users' databases between clusters.  If it's not seen as generally useful that's ok, just surprising.  Best,

Adam

Re: The replicator needs a superuser mode

Posted by Randall Leeds <ra...@gmail.com>.

On Tue, Aug 16, 2011 at 16:23, Paul Davis <pa...@gmail.com> wrote:

> On Tue, Aug 16, 2011 at 4:46 PM, Randall Leeds <ra...@gmail.com> wrote:
> > -1 on _skip_validation and new role
> >
> > One can always write a validation document that considers the role, no? Why
> > can't users who need this functionality craft a validation function for this
> > purpose? This sounds like a blog post and not a database feature.
> >
> > +0 on _dump/_load
> >
> > If it ships raw .couch files I'm totally against it because I think the HTTP
> > API should remain as independent of implementation details as possible. If
> > it is non-incremental I don't see significant benefit, unless it's just to
> > traverse the document index and ignore the sequence index as a way to skip
> > reads, but this seems like a weak argument. If it's incremental, well, then,
> > that's replication, and we already have that.
> >
>
> Think of plain text backups and last resort upgrade paths. Also, it
> wouldn't have validation docs run on it or anything of that nature.
> I'm thinking basically of having a multipart/mime stream
> representation of the database that follows the update sequence. And
> the _dump would allow for a ?since= parameter that would make it
> incremental. This would even give people the ability to do daily logs
> and so on.
>

Right-o. I don't feel strongly about it, like I said, and think it could be
easily crafted as a plugin if we get *that* situation sorted out.
How's my assessment of the need for a special role or validation skipping,
though? Am I right that one could just create a smart validation function?


>
> > -Randall
> >
> >
> > On Tue, Aug 16, 2011 at 11:40, Adam Kocoloski <ko...@apache.org> wrote:
> >
> >> Hi Jean-Pierre, I'm not quite sure I follow that line of reasoning.  A user
> >> with _admin privileges on the database can easily remove any validation
> >> functions prior to writing today.  In my proposal skipping validation would
> >> require _admin rights and an explicit opt-in on a per-request basis.  What
> >> are you trying to guard against with those validation functions?  Best,
> >>
> >> Adam
> >>
> >> On Aug 16, 2011, at 2:29 PM, Jean-Pierre Fiset wrote:
> >>
> >> > I understand the issue brought by Adam since in our CouchDb application,
> >> there is a need to have a replicator role and the validation functions skip
> >> most of the tests if the role is set for the current user.
> >> >
> >> > On the other hand, at the current time, I am not in favour of making
> >> super users for the sake of replication. Although it might solve the
> >> particular problem stated, it removes the ability for a design document to
> >> enforce some "invariant" properties of a database.
> >> >
> >> > Since there is already a way to allow a "replicator" to perform any
> >> changes (role + proper validation function), I do not see the need for this
> >> change. Since the super replicator user removes the ability that a database
> >> has to protect the consistency of its data, and that there does not seem to
> >> be a work-around, I would rather not see this change pushed to CouchDb.
> >> >
> >> > JP
> >> >
> >> > On 11-08-16 10:26 AM, Adam Kocoloski wrote:
> >> >> One of the principal uses of the replicator is to "make this database
> >> look like that one".  We're unable to do that in the general case today
> >> because of the combination of validation functions and out-of-order document
> >> transfers.  It's entirely possible for a document to be saved in the source
> >> DB prior to the installation of a ddoc containing a validation function that
> >> would have rejected the document, for the replicator to install the ddoc in
> >> the target DB before replicating the other document, and for the other
> >> document to then be rejected by the target DB.
> >> >>
> >> >> I propose we add a role which allows a user to bypass validation, or
> >> else extend that privilege to the _admin role.  We should still validate
> >> updates by default and add a way (a new qs param, for instance) to indicate
> >> that validation should be skipped for a particular update.  Thoughts?
> >> >>
> >> >> Adam
> >> >
> >>
> >>
> >
>

Re: The replicator needs a superuser mode

Posted by Paul Davis <pa...@gmail.com>.

On Tue, Aug 16, 2011 at 4:46 PM, Randall Leeds <ra...@gmail.com> wrote:
> -1 on _skip_validation and new role
>
> One can always write a validation document that considers the role, no? Why
> can't users who need this functionality craft a validation function for this
> purpose? This sounds like a blog post and not a database feature.
>
> +0 on _dump/_load
>
> If it ships raw .couch files I'm totally against it because I think the HTTP
> API should remain as independent of implementation details as possible. If
> it is non-incremental I don't see significant benefit, unless it's just to
> traverse the document index and ignore the sequence index as a way to skip
> reads, but this seems like a weak argument. If it's incremental, well, then,
> that's replication, and we already have that.
>

Think of plain text backups and last resort upgrade paths. Also, it
wouldn't have validation docs run on it or anything of that nature.
I'm thinking basically of having a multipart/mime stream
representation of the database that follows the update sequence. And
the _dump would allow for a ?since= parameter that would make it
incremental. This would even give people the ability to do daily logs
and so on.
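
To make the ?since= idea concrete, here is a rough sketch of a dump that follows the update sequence and can be resumed incrementally. Everything in it is an assumption for illustration: the in-memory bySeq array stands in for a database's sequence index, dump() is a hypothetical helper, and no multipart/mime encoding is attempted. No _dump endpoint exists in CouchDB; this only models the behaviour described above.

```javascript
// Toy stand-in for a by-sequence index: one entry per document,
// keyed by the sequence number of its latest update (made-up data).
var bySeq = [
    {seq: 1, id: "a"},
    {seq: 3, id: "b"},
    {seq: 4, id: "_design/app"}
];

// Hypothetical dump(): return every entry newer than `since`, in update
// order, plus the sequence to pass as ?since= on the next run.
function dump(index, since) {
    var entries = index
        .filter(function (e) { return e.seq > since; })
        .sort(function (a, b) { return a.seq - b.seq; });
    var lastSeq = entries.length ? entries[entries.length - 1].seq : since;
    return {entries: entries, last_seq: lastSeq};
}

var full = dump(bySeq, 0);                    // full dump
var incremental = dump(bySeq, full.last_seq); // nothing new since then
```

A daily log would then just be dump(index, yesterdaysLastSeq), with no validation functions involved at any point.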

> -Randall
>
>
> On Tue, Aug 16, 2011 at 11:40, Adam Kocoloski <ko...@apache.org> wrote:
>
>> Hi Jean-Pierre, I'm not quite sure I follow that line of reasoning.  A user
>> with _admin privileges on the database can easily remove any validation
>> functions prior to writing today.  In my proposal skipping validation would
>> require _admin rights and an explicit opt-in on a per-request basis.  What
>> are you trying to guard against with those validation functions?  Best,
>>
>> Adam
>>
>> On Aug 16, 2011, at 2:29 PM, Jean-Pierre Fiset wrote:
>>
>> > I understand the issue brought by Adam since in our CouchDb application,
>> there is a need to have a replicator role and the validation functions skip
>> most of the tests if the role is set for the current user.
>> >
>> > On the other hand, at the current time, I am not in favour of making
>> super users for the sake of replication. Although it might solve the
>> particular problem stated, it removes the ability for a design document to
>> enforce some "invariant" properties of a database.
>> >
>> > Since there is already a way to allow a "replicator" to perform any
>> changes (role + proper validation function), I do not see the need for this
>> change. Since the super replicator user removes the ability that a database
>> has to protect the consistency of its data, and that there does not seem to
>> be a work-around, I would rather not see this change pushed to CouchDb.
>> >
>> > JP
>> >
>> > On 11-08-16 10:26 AM, Adam Kocoloski wrote:
>> >> One of the principal uses of the replicator is to "make this database
>> look like that one".  We're unable to do that in the general case today
>> because of the combination of validation functions and out-of-order document
>> transfers.  It's entirely possible for a document to be saved in the source
>> DB prior to the installation of a ddoc containing a validation function that
>> would have rejected the document, for the replicator to install the ddoc in
>> the target DB before replicating the other document, and for the other
>> document to then be rejected by the target DB.
>> >>
>> >> I propose we add a role which allows a user to bypass validation, or
>> else extend that privilege to the _admin role.  We should still validate
>> updates by default and add a way (a new qs param, for instance) to indicate
>> that validation should be skipped for a particular update.  Thoughts?
>> >>
>> >> Adam
>> >
>>
>>
>

Re: The replicator needs a superuser mode

Posted by Randall Leeds <ra...@gmail.com>.

-1 on _skip_validation and new role

One can always write a validation document that considers the role, no? Why
can't users who need this functionality craft a validation function for this
purpose? This sounds like a blog post and not a database feature.

+0 on _dump/_load

If it ships raw .couch files I'm totally against it because I think the HTTP
API should remain as independent of implementation details as possible. If
it is non-incremental I don't see significant benefit, unless it's just to
traverse the document index and ignore the sequence index as a way to skip
reads, but this seems like a weak argument. If it's incremental, well, then,
that's replication, and we already have that.

-Randall


On Tue, Aug 16, 2011 at 11:40, Adam Kocoloski <ko...@apache.org> wrote:

> Hi Jean-Pierre, I'm not quite sure I follow that line of reasoning.  A user
> with _admin privileges on the database can easily remove any validation
> functions prior to writing today.  In my proposal skipping validation would
> require _admin rights and an explicit opt-in on a per-request basis.  What
> are you trying to guard against with those validation functions?  Best,
>
> Adam
>
> On Aug 16, 2011, at 2:29 PM, Jean-Pierre Fiset wrote:
>
> > I understand the issue brought by Adam since in our CouchDb application,
> there is a need to have a replicator role and the validation functions skip
> most of the tests if the role is set for the current user.
> >
> > On the other hand, at the current time, I am not in favour of making
> super users for the sake of replication. Although it might solve the
> particular problem stated, it removes the ability for a design document to
> enforce some "invariant" properties of a database.
> >
> > Since there is already a way to allow a "replicator" to perform any
> changes (role + proper validation function), I do not see the need for this
> change. Since the super replicator user removes the ability that a database
> has to protect the consistency of its data, and that there does not seem to
> be a work-around, I would rather not see this change pushed to CouchDb.
> >
> > JP
> >
> > On 11-08-16 10:26 AM, Adam Kocoloski wrote:
> >> One of the principal uses of the replicator is to "make this database
> look like that one".  We're unable to do that in the general case today
> because of the combination of validation functions and out-of-order document
> transfers.  It's entirely possible for a document to be saved in the source
> DB prior to the installation of a ddoc containing a validation function that
> would have rejected the document, for the replicator to install the ddoc in
> the target DB before replicating the other document, and for the other
> document to then be rejected by the target DB.
> >>
> >> I propose we add a role which allows a user to bypass validation, or
> else extend that privilege to the _admin role.  We should still validate
> updates by default and add a way (a new qs param, for instance) to indicate
> that validation should be skipped for a particular update.  Thoughts?
> >>
> >> Adam
> >
>
>

Re: The replicator needs a superuser mode

Posted by Adam Kocoloski <ko...@apache.org>.

Hi Jean-Pierre, I'm not quite sure I follow that line of reasoning.  A user with _admin privileges on the database can easily remove any validation functions prior to writing today.  In my proposal skipping validation would require _admin rights and an explicit opt-in on a per-request basis.  What are you trying to guard against with those validation functions?  Best,

Adam
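
For readers following along, the rule proposed here (skip validation only for an _admin who explicitly opts in on that request) could be sketched roughly as below. The function name mustValidate and the query parameter name skip_validation are assumptions made for illustration; the thread only floats "a new qs param", and nothing like this exists in CouchDB.

```javascript
// Hypothetical gate: validation runs by default and is skipped only when
// the caller is an _admin AND the request explicitly opts out.
function mustValidate(userCtx, query) {
    var isAdmin = userCtx.roles.indexOf("_admin") !== -1;
    var optedOut = query.skip_validation === "true"; // assumed parameter name
    return !(isAdmin && optedOut);
}

var admin = {name: "root", roles: ["_admin"]};
var user = {name: "bob", roles: []};

var adminDefault = mustValidate(admin, {});                        // still validated
var adminOptOut = mustValidate(admin, {skip_validation: "true"});  // skipped
var userOptOut = mustValidate(user, {skip_validation: "true"});    // still validated
```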

On Aug 16, 2011, at 2:29 PM, Jean-Pierre Fiset wrote:

> I understand the issue brought by Adam since in our CouchDb application, there is a need to have a replicator role and the validation functions skip most of the tests if the role is set for the current user.
> 
> On the other hand, at the current time, I am not in favour of making super users for the sake of replication. Although it might solve the particular problem stated, it removes the ability for a design document to enforce some "invariant" properties of a database.
> 
> Since there is already a way to allow a "replicator" to perform any changes (role + proper validation function), I do not see the need for this change. Since the super replicator user removes the ability that a database has to protect the consistency of its data, and that there does not seem to be a work-around, I would rather not see this change pushed to CouchDb.
> 
> JP
> 
> On 11-08-16 10:26 AM, Adam Kocoloski wrote:
>> One of the principal uses of the replicator is to "make this database look like that one".  We're unable to do that in the general case today because of the combination of validation functions and out-of-order document transfers.  It's entirely possible for a document to be saved in the source DB prior to the installation of a ddoc containing a validation function that would have rejected the document, for the replicator to install the ddoc in the target DB before replicating the other document, and for the other document to then be rejected by the target DB.
>> 
>> I propose we add a role which allows a user to bypass validation, or else extend that privilege to the _admin role.  We should still validate updates by default and add a way (a new qs param, for instance) to indicate that validation should be skipped for a particular update.  Thoughts?
>> 
>> Adam
>

Re: The replicator needs a superuser mode

Posted by Jean-Pierre Fiset <jp...@fiset.ca>.

I understand the issue brought by Adam since in our CouchDb application, there is a need to have
a replicator role and the validation functions skip most of the tests if the role is set for the
current user.
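
A minimal sketch of that kind of validation function follows, for illustration only. The role name "replicator" and the example check on newDoc.type are assumptions, not the actual code from our application; in a real design document the function would be stored as a string under validate_doc_update.

```javascript
// Hypothetical role name; any role granted to the replicating user would do.
var REPLICATOR_ROLE = "replicator";

function validate(newDoc, oldDoc, userCtx, secObj) {
    // Users holding the replicator role skip the application-level checks.
    if (userCtx.roles.indexOf(REPLICATOR_ROLE) !== -1) {
        return;
    }
    // Example application check (assumed for illustration).
    if (!newDoc._deleted && typeof newDoc.type !== "string") {
        throw {forbidden: "Documents must have a string 'type' field."};
    }
}

// Helper for trying it out: returns the rejection reason, or null if accepted.
function tryWrite(doc, roles) {
    try {
        validate(doc, null, {name: "someone", roles: roles}, {});
        return null;
    } catch (e) {
        return e.forbidden;
    }
}
```

With this in place, tryWrite({_id: "x"}, ["replicator"]) is accepted while the same document written by a user without the role is rejected.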

On the other hand, at the current time, I am not in favour of making super users for the sake of
replication. Although it might solve the particular problem stated, it removes the ability for a
design document to enforce some "invariant" properties of a database.

Since there is already a way to allow a "replicator" to perform any changes (role + proper
validation function), I do not see the need for this change. Since the super replicator user
removes the ability that a database has to protect the consistency of its data, and that there
does not seem to be a work-around, I would rather not see this change pushed to CouchDb.

JP

On 11-08-16 10:26 AM, Adam Kocoloski wrote:
> One of the principal uses of the replicator is to "make this database look like that one".  We're unable to do that in the general case today because of the combination of validation functions and out-of-order document transfers.  It's entirely possible for a document to be saved in the source DB prior to the installation of a ddoc containing a validation function that would have rejected the document, for the replicator to install the ddoc in the target DB before replicating the other document, and for the other document to then be rejected by the target DB.
> 
> I propose we add a role which allows a user to bypass validation, or else extend that privilege to the _admin role.  We should still validate updates by default and add a way (a new qs param, for instance) to indicate that validation should be skipped for a particular update.  Thoughts?
> 
> Adam
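
The ordering problem described in the quoted message can be reproduced with a toy model. This is a simplified simulation, not CouchDB code: the target "database" is a plain object, design documents are assumed to be stored without validation, and the validation function is kept as a real function rather than a string. All names (write, replicate, _design/strict) are made up for the example.

```javascript
// Toy target database write: run installed validators, then store the doc.
function write(db, doc) {
    var isDesign = doc._id.indexOf("_design/") === 0;
    if (!isDesign) {
        for (var i = 0; i < db.validators.length; i++) {
            try {
                db.validators[i](doc, db.docs[doc._id] || null);
            } catch (e) {
                return {ok: false, id: doc._id, reason: e.forbidden};
            }
        }
    }
    db.docs[doc._id] = doc;
    // Installing a ddoc activates its validation function for later writes.
    if (isDesign && typeof doc.validate_doc_update === "function") {
        db.validators.push(doc.validate_doc_update);
    }
    return {ok: true, id: doc._id};
}

// Replicate into a fresh target, in the given transfer order.
function replicate(docsInTransferOrder) {
    var target = {docs: {}, validators: []};
    return docsInTransferOrder.map(function (d) { return write(target, d); });
}

// "legacy" was saved in the source before the ddoc was installed there.
var legacy = {_id: "legacy"};
var ddoc = {
    _id: "_design/strict",
    validate_doc_update: function (newDoc) {
        if (!newDoc.type) { throw {forbidden: "type required"}; }
    }
};

var inOrder = replicate([legacy, ddoc]);     // both documents accepted
var outOfOrder = replicate([ddoc, legacy]);  // legacy is rejected by the target
```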