Posted to user@couchdb.apache.org by Liam Staskawicz <ls...@gmail.com> on 2008/11/26 18:57:24 UTC

partial/diff updates?

When updating a document, is there any notion of submitting a partial  
update?  It seems like being able to specify that only some subset of  
the fields in a document should be updated would offer some efficiency  
benefits.  I guess I had in mind some scenario where CouchDB would  
create the updated record by merging the existing revision with the  
new info and saving the updated revision, but I'm still new to CouchDB  
so I don't have a good sense of whether this tramples on any important  
concepts.

Thanks for any thoughts.

Liam

Re: partial/diff updates?

Posted by Noah Slater <ns...@apache.org>.

On Wed, Nov 26, 2008 at 01:05:57PM -0800, Liam Staskawicz wrote:
> Hm - maybe it's a different consideration for attachments, but I don't see why
> you would need to keep a diff around in the context of updating a document.
> Apply the updates from the PUT to the latest revision and then call it the new
> revision. Crazy?

Nope, not crazy. I wouldn't expect the diff to be kept around either.

I would imagine that the diff would be processed by CouchDB, and the new
document applied as the latest revision. No need to keep the diff.
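
A minimal sketch of the behaviour described above, assuming a shallow, field-by-field merge (the thread does not settle on a merge format). `RevisionStore` and `apply_partial_update` are hypothetical names for illustration, not CouchDB APIs, and the store is only an in-memory dict.

```python
import copy


class RevisionStore:
    """Toy in-memory store that keeps a full copy of every revision.

    Hypothetical illustration only; this is not how CouchDB is implemented.
    """

    def __init__(self):
        self._revisions = {}  # doc_id -> list of full documents, oldest first

    def latest(self, doc_id):
        return copy.deepcopy(self._revisions[doc_id][-1])

    def save(self, doc_id, doc):
        self._revisions.setdefault(doc_id, []).append(copy.deepcopy(doc))
        return len(self._revisions[doc_id])  # toy revision number

    def apply_partial_update(self, doc_id, changes):
        """Merge `changes` into the latest revision and store the full result.

        Assumes a shallow merge: top-level fields in `changes` overwrite
        the existing ones. The `changes` dict itself is not kept.
        """
        merged = self.latest(doc_id)
        merged.update(changes)
        return self.save(doc_id, merged)


store = RevisionStore()
store.save("doc1", {"title": "notes", "body": "draft", "visits": 1})
rev = store.apply_partial_update("doc1", {"visits": 2})
print(rev, store.latest("doc1"))
```

Only full documents end up in the store; the partial update exists just long enough to be merged.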

-- 
Noah Slater, http://tumbolia.org/nslater

Re: partial/diff updates?

Posted by Liam Staskawicz <ls...@gmail.com>.

On Nov 26, 2008, at 12:40 PM, Dan wrote:

> Here is a conversation I had on the IRC channel #couchdb on this subject on
> November 24, 2008 (2 days ago). Hope this helps!
>
> (04:01:26 PM) dsimard: I just  wanted to know if an attachment changes, will
> the new revision contain just the "diff" with the old attachment or the
> complete attachment?
> (04:01:49 PM) jan____: complete attachment. diffs are the devil
> (04:03:08 PM) dsimard: damn... all fields of a document are stored as a full
> document?
> (04:03:18 PM) dsimard: I really thought that diffs were used
> (04:03:35 PM) jan____: no, no diffs. diffs are the devil
> (04:04:14 PM) dsimard: ok, could you elaborate on the evilness of diffs?
> (04:04:44 PM) dsimard: I just want to know more about it
> (04:05:04 PM) jan____: dsimard: you need to keep diffs around forever to
> construct the latest live doc. this totally conflicts with the couchdb
> storage model which uses full representations of each revision.
> (04:05:04 PM) dsimard: or if you have a good link about it
> (04:05:35 PM) jan____:
> http://incubator.apache.org/couchdb/docs/overview.html
> (04:05:36 PM) jan____: that one
>
> In my opinion, it would be a great addition to couchdb. But still, I can't
> wait to use it on my next project.

Hm - maybe it's a different consideration for attachments, but I don't  
see why you would need to keep a diff around in the context of  
updating a document.  Apply the updates from the PUT to the latest  
revision and then call it the new revision.  Crazy?

Liam

Re: partial/diff updates?

Posted by Chris Anderson <jc...@apache.org>.

diffs - if we can get the transport right, they'd make a great use case.

The problem, as has been discussed in the archives, is a canonical
json format, or a json diff format, neither of which is obvious to
implement.
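
One reason a canonical format matters: the same logical document can serialize to different text, so a textual diff or hash of it is unstable. The sketch below uses Python's standard `json` module with sorted keys and fixed separators. This is an illustrative convention only, not a format CouchDB or this thread adopted, and it leaves open harder questions such as number formatting. `canonical_json` is a hypothetical helper name.

```python
import json


def canonical_json(doc):
    """Serialize with sorted keys and no optional whitespace."""
    return json.dumps(doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


# Two texts for the same logical document, differing in key order and spacing.
a = json.loads('{"name": "group", "size": 3}')
b = json.loads('{"size": 3,   "name": "group"}')

same_document = a == b
same_default_text = json.dumps(a) == json.dumps(b)
same_canonical_text = canonical_json(a) == canonical_json(b)
print(same_document, same_default_text, same_canonical_text)
```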

For attachments, binary diffs would be interesting. I wonder how
usable git's format would be for us?

-- 
Chris Anderson
http://jchris.mfdz.com

Re: partial/diff updates?

Posted by Jan Lehnardt <ja...@apache.org>.

On 26 Nov 2008, at 12:40, Dan wrote:

> Here is a conversation I had on the IRC channel #couchdb on this subject on
> November 24, 2008 (2 days ago). Hope this helps!

I kinda don't like IRC quotes floating around, but hey, I didn't put up any
disclaimers either. Take the following with a grain of salt :)

> (04:01:26 PM) dsimard: I just  wanted to know if an attachment changes, will
> the new revision contain just the "diff" with the old attachment or the
> complete attachment?
> (04:01:49 PM) jan____: complete attachment. diffs are the devil
> (04:03:08 PM) dsimard: damn... all fields of a document are stored as a full
> document?
> (04:03:18 PM) dsimard: I really thought that diffs were used
> (04:03:35 PM) jan____: no, no diffs. diffs are the devil
> (04:04:14 PM) dsimard: ok, could you elaborate on the evilness of diffs?
> (04:04:44 PM) dsimard: I just want to know more about it
> (04:05:04 PM) jan____: dsimard: you need to keep diffs around forever to
> construct the latest live doc. this totally conflicts with the couchdb
> storage model which uses full representations of each revision.
> (04:05:04 PM) dsimard: or if you have a good link about it
> (04:05:35 PM) jan____:
> http://incubator.apache.org/couchdb/docs/overview.html
> (04:05:36 PM) jan____: that one
>
> In my opinion, it would be a great addition to couchdb. But still, I can't
> wait to use it on my next project.

Re: partial/diff updates?

Posted by Dan <ds...@gmail.com>.

Here is a conversation I had on the IRC channel #couchdb on this subject on
November 24, 2008 (2 days ago). Hope this helps!

(04:01:26 PM) dsimard: I just  wanted to know if an attachment changes, will
the new revision contain just the "diff" with the old attachment or the
complete attachment?
(04:01:49 PM) jan____: complete attachment. diffs are the devil
(04:03:08 PM) dsimard: damn... all fields of a document are stored as a full
document?
(04:03:18 PM) dsimard: I really thought that diffs were used
(04:03:35 PM) jan____: no, no diffs. diffs are the devil
(04:04:14 PM) dsimard: ok, could you elaborate on the evilness of diffs?
(04:04:44 PM) dsimard: I just want to know more about it
(04:05:04 PM) jan____: dsimard: you need to keep diffs around forever to
construct the latest live doc. this totally conflicts with the couchdb
storage model which uses full representations of each revision.
(04:05:04 PM) dsimard: or if you have a good link about it
(04:05:35 PM) jan____:
http://incubator.apache.org/couchdb/docs/overview.html
(04:05:36 PM) jan____: that one
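
The point made in the log about keeping diffs around can be sketched with a toy model. It assumes shallow diffs (dicts of changed top-level fields); `latest_from_diffs` and `latest_from_full_revisions` are hypothetical helpers, not anything CouchDB provides.

```python
def latest_from_diffs(base, diffs):
    """Rebuild the latest document by replaying every diff in order."""
    doc = dict(base)
    for diff in diffs:
        doc.update(diff)
    return doc


def latest_from_full_revisions(revisions):
    """With full revisions the latest document is simply the last entry."""
    return revisions[-1]


base = {"title": "notes", "body": "v1"}
diffs = [{"body": "v2"}, {"tags": ["draft"]}, {"body": "v3"}]

# Full-revision history built from the same edits.
revisions = [base]
for diff in diffs:
    revisions.append({**revisions[-1], **diff})

complete = latest_from_diffs(base, diffs)
# Dropping an old diff silently changes the reconstructed document...
damaged = latest_from_diffs(base, diffs[:1] + diffs[2:])
# ...while dropping old full revisions does not affect the latest one.
pruned = latest_from_full_revisions(revisions[-1:])
print(complete == pruned, damaged == complete)
```

In this model every diff is needed to read the current document, whereas with full revisions the newest copy stands on its own.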

In my opinion, it would be a great addition to couchdb. But still, I can't
wait to use it on my next project.

Re: partial/diff updates?

Posted by Liam Staskawicz <ls...@gmail.com>.

On Nov 26, 2008, at 12:20 PM, Noah Slater wrote:

> On Wed, Nov 26, 2008 at 09:57:24AM -0800, Liam Staskawicz wrote:
>> When updating a document, is there any notion of submitting a partial update?
>> It seems like being able to specify that only some subset of the fields in a
>> document should be updated would offer some efficiency benefits.  I guess I
>> had in mind some scenario where CouchDB would create the updated record by
>> merging the existing revision with the new info and saving the updated
>> revision, but I'm still new to CouchDB so I don't have a good sense of whether
>> this tramples on any important concepts.
>
> Nope, CouchDB does not support this at the moment. If you want to make an update
> you have to send the entire document each time.
>
> There is some discussion among CouchDB users and developers about the benefits
> of partial updates but it seems the real sticking point so far is deciding on
> the mechanism for enabling this. It seems the rough consensus at this point is
> that whatever method we use be something that is standardised, either through a
> standards body or de facto within the larger JSON community.

Thanks for the response - and yeah, this is not a sticking point at
the moment, but as systems start to ramp up this seems like a pretty
good way to make the back-and-forth much more efficient. Will be
looking forward to this being introduced at some point.

Liam

Re: partial/diff updates?

Posted by Noah Slater <ns...@apache.org>.

On Wed, Nov 26, 2008 at 09:57:24AM -0800, Liam Staskawicz wrote:
> When updating a document, is there any notion of submitting a partial update?
> It seems like being able to specify that only some subset of the fields in a
> document should be updated would offer some efficiency benefits.  I guess I
> had in mind some scenario where CouchDB would create the updated record by
> merging the existing revision with the new info and saving the updated
> revision, but I'm still new to CouchDB so I don't have a good sense of whether
> this tramples on any important concepts.

Nope, CouchDB does not support this at the moment. If you want to make an update
you have to send the entire document each time.
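
A rough sketch of what "send the entire document each time" means for a client: fetch the current document, change fields locally, then send the whole thing back together with its `_rev`. `FakeDatabase`, `Conflict` and `update_fields` are hypothetical names used so the example runs without a server; against CouchDB itself the same steps would be a GET followed by a PUT of the complete document, and the toy revision counter here does not resemble real revision identifiers.

```python
import copy


class Conflict(Exception):
    """Raised when the submitted _rev is not the current one."""


class FakeDatabase:
    """In-memory stand-in for a document database (hypothetical)."""

    def __init__(self):
        self._docs = {}

    def get(self, doc_id):
        return copy.deepcopy(self._docs[doc_id])

    def put(self, doc_id, doc):
        """Store `doc` as the complete new version of the document."""
        current = self._docs.get(doc_id)
        if current is not None and doc.get("_rev") != current["_rev"]:
            raise Conflict(doc_id)
        stored = copy.deepcopy(doc)
        # Toy revision counter, for illustration only.
        stored["_rev"] = str(int(current["_rev"]) + 1) if current else "1"
        self._docs[doc_id] = stored
        return stored["_rev"]


def update_fields(db, doc_id, changes):
    """Client-side 'partial update': fetch, modify locally, send everything."""
    doc = db.get(doc_id)
    doc.update(changes)
    return db.put(doc_id, doc)


db = FakeDatabase()
db.put("user-1", {"name": "Liam", "email": "old@example.org"})
new_rev = update_fields(db, "user-1", {"email": "new@example.org"})

# A writer still holding the old revision is rejected.
stale = {"name": "Liam", "email": "x@example.org", "_rev": "1"}
try:
    db.put("user-1", stale)
    conflict_raised = False
except Conflict:
    conflict_raised = True
print(new_rev, db.get("user-1"), conflict_raised)
```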

There is some discussion among CouchDB users and developers about the benefits
of partial updates, but the real sticking point so far seems to be deciding on
the mechanism for enabling this. The rough consensus at this point is that
whatever method we use should be standardised, either through a standards body
or de facto within the larger JSON community.

Hope this helps,

-- 
Noah Slater, http://tumbolia.org/nslater

Re: partial/diff updates?

Posted by Antony Blakey <an...@gmail.com>.

Have a look at the archive for this list - the thread 'Document  
Updates' contains some pertinent discussion.

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Always have a vision. Why spend your life making other people’s dreams?
  -- Orson Welles (1915-1985)

Re: partial/diff updates?

Posted by Paul Carey <pa...@gmail.com>.

> If I could do partial updates, I would store all the user_ids inside a
> field of a group doc such as:
>
> "memberships": [user_id1, user_id2, etc]
>
> I could do this right now, but what if I want to update or delete a
> membership? If a group has a million users

This model suggests that you would effectively be allowing a million
users to write to a single document, which is probably unwise if
contention-free writes are a goal of your application.

A better approach might be to store the group membership in each user
doc. A view could emit the group_id key against the user doc - this
would provide roughly the same data as a group doc containing an array
of user ids.
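
The suggestion above can be sketched as follows. CouchDB map functions are normally written in JavaScript; this Python version only simulates the idea. The `type` field and the helper names `map_user_to_group`, `build_view` and `members_of` are assumptions made for illustration.

```python
def map_user_to_group(doc):
    """Map step: emit (group_id, user id) for each user document."""
    if doc.get("type") == "user" and "group_id" in doc:
        yield doc["group_id"], doc["_id"]


def build_view(docs, map_fn):
    """Run the map function over all docs and return rows sorted by key."""
    rows = [row for doc in docs for row in map_fn(doc)]
    return sorted(rows)


def members_of(view_rows, group_id):
    """Look up one key in the view, similar to querying a view by key."""
    return [user_id for key, user_id in view_rows if key == group_id]


docs = [
    {"_id": "user1", "type": "user", "group_id": "climbers"},
    {"_id": "user2", "type": "user", "group_id": "cyclists"},
    {"_id": "user3", "type": "user", "group_id": "climbers"},
    {"_id": "climbers", "type": "group", "name": "Climbers"},
]

view = build_view(docs, map_user_to_group)
print(members_of(view, "climbers"))
```

Changing a membership then touches only that user's document, so writers do not contend on a single group document.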

Paul