Posted to dev@couchdb.apache.org by Robert Dionne <di...@dionne-associates.com> on 2011/01/23 13:04:42 UTC

Re: Idea: Piggyback doc on conflict

+1 

this sounds like an excellent idea.


On Jan 23, 2011, at 12:21 AM, kowsik wrote:

> I've been spending a fair bit of time on profiling the performance
> aspects of Couch. One common recurring theme is updating documents on
> a write-heavy site. This is currently what happens:
> 
> PUT /db/doc_id
>    <- 409 indicating conflict
> 
> loop do
>    GET /db/doc_id
>        <- 200
> 
>    PUT /db/doc_id
>        <- 201 (successful and returns the new _rev)
> end until we get a 201
> 
> What would be beneficial is if I can request the "current" doc during
> PUT like so:
> 
> PUT /db/doc_id?include_doc=true
>    <- 409 conflict (but the 'doc' at the current _rev is returned)
> 
> This would allow the caller to simply take the doc that was returned,
> update it and try PUT again (eliminate the extra GET). This is
> especially valuable when the app is on one geo and the db is in yet
> another (think couchone or cloudant).
> 
> 2 cents,
> 
> K.
> ---
> http://twitter.com/pcapr
> http://labs.mudynamics.com
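
The two flows in the quoted proposal can be compared with a small model. This is only a sketch: FakeCouch is a hypothetical in-memory stand-in, not CouchDB itself, revisions are simplified to integers, and include_doc on PUT is the option being proposed in this thread, not something the server is known to support. It also assumes the document already exists and that no other writer intervenes between requests.

```python
import copy


class FakeCouch:
    """Hypothetical in-memory stand-in for the server; counts round trips."""

    def __init__(self):
        self.docs = {}
        self.requests = 0

    def get(self, doc_id):
        self.requests += 1
        return 200, copy.deepcopy(self.docs[doc_id])

    def put(self, doc_id, doc, include_doc=False):
        self.requests += 1
        current = self.docs.get(doc_id)
        current_rev = current["_rev"] if current else None
        if doc.get("_rev") != current_rev:
            body = {"error": "conflict"}
            if include_doc:
                # The proposed behaviour: piggyback the current doc on the 409.
                body["doc"] = copy.deepcopy(current)
            return 409, body
        new_rev = (current_rev or 0) + 1  # integer revs, for illustration only
        self.docs[doc_id] = dict(copy.deepcopy(doc), _rev=new_rev)
        return 201, {"ok": True, "rev": new_rev}


def update_with_get(server, doc_id, doc, mutate):
    """Current flow: on 409, GET the current doc, re-apply the change, PUT again."""
    status, body = server.put(doc_id, mutate(doc))
    while status == 409:
        _, current = server.get(doc_id)
        status, body = server.put(doc_id, mutate(current))
    return body["rev"]


def update_with_piggyback(server, doc_id, doc, mutate):
    """Proposed flow: the 409 response already carries the current doc."""
    status, body = server.put(doc_id, mutate(doc), include_doc=True)
    while status == 409:
        status, body = server.put(doc_id, mutate(body["doc"]), include_doc=True)
    return body["rev"]


def bump(doc):
    """Example change: return a copy of the doc with its hit counter incremented."""
    return dict(doc, hits=doc.get("hits", 0) + 1)
```

Starting from a stale copy, the first helper needs three requests (PUT, GET, PUT) and the second needs two (PUT, PUT), which is the saving the proposal is after.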


Re: Idea: Piggyback doc on conflict

Posted by Jan Lehnardt <ja...@apache.org>.
On 23 Jan 2011, at 16:03, Robert Newson wrote:

> I can see utility in the original proposal; my only point is whether
> it would still be utile if other mechanisms were introduced. You've
> clarified that the two cases are distinct, so it seems reasonable to
> me now.
> 
> However, I'd hope we'd return the doc and not just the _rev. The only
> thing returning the rev itself will do is encourage blind overwrites.

You are correct of course. In my attempt to clarify one issue, 
I confuddled another. Sorry :)

Cheers
Jan
-- 




Re: Idea: Piggyback doc on conflict

Posted by Robert Newson <ro...@gmail.com>.
I can see utility in the original proposal; my only point is whether
it would still be utile if other mechanisms were introduced. You've
clarified that the two cases are distinct, so it seems reasonable to
me now.

However, I'd hope we'd return the doc and not just the _rev. The only
thing returning the rev itself will do is encourage blind overwrites.

B.

On Sun, Jan 23, 2011 at 2:58 PM, Jan Lehnardt <ja...@apache.org> wrote:
> The point of confusion here is that there are two different types of conflict.
>
> 1. _rev mismatch on a regular write. This is the quoted scenario. Returning the expected _rev with the error result lets the client skip an extra GET request. The replicator never does this.
>
> 2. _rev mismatch on a replication write (or a client with the all_or_nothing option). The response is always 201, even when a conflict is created; the caller isn't notified (or maybe it is, but the replicator isn't using this).
>
> A read-repair function assumes that conflicts appear as in 2. Returning the expected _rev in a failed write only happens in 1.
>
> Cheers
> Jan
> --

Re: Idea: Piggyback doc on conflict

Posted by Jan Lehnardt <ja...@apache.org>.
The point of confusion here is that there are two different types of conflict.

1. _rev mismatch on a regular write. This is the quoted scenario. Returning the expected _rev with the error result lets the client skip an extra GET request. The replicator never does this.

2. _rev mismatch on a replication write (or a client with the all_or_nothing option). The response is always 201, even when a conflict is created; the caller isn't notified (or maybe it is, but the replicator isn't using this).

A read-repair function assumes that conflicts appear as in 2. Returning the expected _rev in a failed write only happens in 1.
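
The difference between the two cases can be modelled roughly as below. This is a deliberately simplified sketch, not how the server is implemented: a document is reduced to a dict of leaf revisions, the function names and revision strings are made up for illustration, and revision history is ignored.

```python
def regular_write(leaves, parent_rev, new_rev, body):
    """Case 1: reject the write unless it names a current leaf revision."""
    if parent_rev not in leaves:
        return 409  # the caller is told about the mismatch
    del leaves[parent_rev]
    leaves[new_rev] = body
    return 201


def replication_write(leaves, new_rev, body):
    """Case 2: always accept; an extra leaf is stored as a conflict."""
    leaves[new_rev] = body
    return 201  # the caller gets no hint that a conflict now exists


# One document, represented only by its leaf revisions (rev -> body).
leaves = {"1-a": {"hits": 1}}
first = regular_write(leaves, "1-a", "2-b", {"hits": 2})
stale = regular_write(leaves, "1-a", "2-c", {"hits": 7})
replicated = replication_write(leaves, "2-c", {"hits": 7})
in_conflict = len(leaves) > 1
```

In this model only the stale regular write is refused with a 409; the replication-style write succeeds and leaves the document with two leaf revisions, which is the situation a read-repair function would have to clean up.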

Cheers
Jan
-- 



On 23 Jan 2011, at 14:23, Robert Dionne wrote:

> These are also interesting ideas, but I don't think they adequately satisfy this particular write-heavy scenario. The client receiving the 409 has in hand the doc they wished to write and may just want to add a
> field or update one. A general resolve_conflict function is a good idea for certain collaborative environments, but I don't think it would handle this specific case.
> 
> Having the conflict-causing update return the doc that caused it seems really ideal. I'm still +1 on it.


Re: Idea: Piggyback doc on conflict

Posted by Robert Dionne <di...@dionne-associates.com>.
These are also interesting ideas, but I don't think they adequately satisfy this particular write-heavy scenario. The client receiving the 409 has in hand the doc they wished to write and may just want to add a 
field or update one. A general resolve_conflict function is a good idea for certain collaborative environments, but I don't think it would handle this specific case.

Having the conflict-causing update return the doc that caused it seems really ideal. I'm still +1 on it.






On Jan 23, 2011, at 7:51 AM, Robert Newson wrote:

> Oooh, crosspost.
> 
> Had a similar chat on IRC last night.
> 
> I'm -0 on returning the doc during a 409 PUT just because I think
> there are other options that might be preferred.
> 
> For example, allowing a read_repair function in ddocs, that would take
> all conflicting revisions as input and return the resolved document as
> output. Or allowing a resolve_conflict function that is called at the
> moment of conflict creation, allowing it to be downgraded to a
> non-conflicting update.
> 
> With either, or both, of those mechanisms, the proposed one here is unnecessary.
> 
> B.


Re: Idea: Piggyback doc on conflict

Posted by Robert Newson <ro...@gmail.com>.
Oooh, crosspost.

Had a similar chat on IRC last night.

I'm -0 on returning the doc during a 409 PUT just because I think
there are other options that might be preferred.

For example, allowing a read_repair function in ddocs, that would take
all conflicting revisions as input and return the resolved document as
output. Or allowing a resolve_conflict function that is called at the
moment of conflict creation, allowing it to be downgraded to a
non-conflicting update.

With either, or both, of those mechanisms, the proposed one here is unnecessary.
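
As a rough illustration of the first idea, a read_repair function could look something like the sketch below. Everything in it is an assumption made for the example: the thread only proposes such a hook, the signature is hypothetical, and the hits and tags fields and the merge rules (highest counter, union of tags) are invented.

```python
def read_repair(revisions):
    """Hypothetical resolver: takes all conflicting revisions of one document
    and returns a single resolved document."""
    if not revisions:
        raise ValueError("need at least one revision")
    resolved = {"_id": revisions[0]["_id"]}
    # Invented merge rules: keep the highest counter and the union of all tags.
    resolved["hits"] = max(r.get("hits", 0) for r in revisions)
    resolved["tags"] = sorted({t for r in revisions for t in r.get("tags", [])})
    return resolved


conflicting = [
    {"_id": "doc_id", "_rev": "2-b", "hits": 3, "tags": ["x"]},
    {"_id": "doc_id", "_rev": "2-c", "hits": 5, "tags": ["y", "x"]},
]
merged = read_repair(conflicting)
```

A resolver like this only makes sense when the application can state a merge rule for every field; where it cannot, the conflict still has to go back to the client.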

B.

On Sun, Jan 23, 2011 at 12:04 PM, Robert Dionne
<di...@dionne-associates.com> wrote:
> +1
>
> this sounds like an excellent idea.