Posted to user@couchdb.apache.org by Liam Staskawicz <ls...@gmail.com> on 2008/11/26 18:57:24 UTC

partial/diff updates?

When updating a document, is there any notion of submitting a partial  
update?  It seems like being able to specify that only some subset of  
the fields in a document should be updated would offer some efficiency  
benefits.  I guess I had in mind some scenario where CouchDB would  
create the updated record by merging the existing revision with the  
new info and saving the updated revision, but I'm still new to CouchDB  
so I don't have a good sense of whether this tramples on any important  
concepts.

Thanks for any thoughts.

Liam

Re: partial/diff updates?

Posted by Noah Slater <ns...@apache.org>.
On Wed, Nov 26, 2008 at 01:05:57PM -0800, Liam Staskawicz wrote:
> Hm - maybe it's a different consideration for attachments, but I don't see why
> you would need to keep a diff around in the context of updating a document.
> Apply the updates from the PUT to the latest revision and then call it the new
> revision. Crazy?

Nope, I wouldn't see it working like this either.

I would imagine that the diff would be processed by CouchDB, and the new
document applied as the latest revision. No need to keep the diff.
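
A minimal sketch of the behaviour described here, in Python: the submitted fields are merged into the latest revision and the result is stored as a complete new revision, so the partial update itself never needs to be kept. The helper name apply_partial_update and the shallow-merge semantics are assumptions made for illustration; CouchDB did not offer this at the time of the thread.

```python
import copy

def apply_partial_update(latest, partial):
    """Return a new full document: `latest` with the fields in `partial` overwritten.

    Hypothetical helper. The merge is shallow: top-level fields in `partial`
    replace the same fields in `latest`; everything else is carried over.
    """
    merged = copy.deepcopy(latest)
    merged.update(partial)
    return merged

latest = {"_id": "liam", "name": "Liam", "city": "Oakland", "tags": ["new"]}
new_revision = apply_partial_update(latest, {"city": "Berkeley"})
print(new_revision)
```

A shallow merge like this cannot express deleting a field or changing a single array element, which is part of why agreeing on a diff format is harder than it first looks.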

-- 
Noah Slater, http://tumbolia.org/nslater

Re: partial/diff updates?

Posted by Liam Staskawicz <ls...@gmail.com>.
On Nov 26, 2008, at 12:40 PM, Dan wrote:

> Here is a conversation I had on the IRC channel #couchdb on this  
> subject on
> November 24, 2008 (2 days ago). Hope this helps!
>
> (04:01:26 PM) dsimard: I just  wanted to know if an attachment  
> changes, will
> the new revision contain just the "diff" with the old attachment or  
> the
> complete attachment?
> (04:01:49 PM) jan____: complete attachment. diffs are the devil
> (04:03:08 PM) dsimard: damn... all fields of a document are stored  
> as a full
> document?
> (04:03:18 PM) dsimard: I really thought that diffs were used
> (04:03:35 PM) jan____: no, no diffs. diffs are the devil
> (04:04:14 PM) dsimard: ok, could you elaborate on the evilness of  
> diffs?
> (04:04:44 PM) dsimard: I just want to know more about it
> (04:05:04 PM) jan____: dsimard: you need to keep diffs around  
> forever to
> construct the latest live doc. this totally conflicts with the couchdb
> storage model which uses full representations of each revision.
> (04:05:04 PM) dsimard: or if you have a good link about it
> (04:05:35 PM) jan____:
> http://incubator.apache.org/couchdb/docs/overview.html
> (04:05:36 PM) jan____: that one
>
> In my opinion, it would be a great addition to couchdb. But still, I  
> can't
> wait to use it on my next project.

Hm - maybe it's a different consideration for attachments, but I don't  
see why you would need to keep a diff around in the context of  
updating a document.  Apply the updates from the PUT to the latest  
revision and then call it the new revision.  Crazy?

Liam

Re: partial/diff updates?

Posted by Chris Anderson <jc...@apache.org>.
diffs - if we can get the transport right, they'd make a great use case.

The problem, as has been discussed in the archives, is a canonical
json format, or a json diff format, neither of which are obvious to
implement.
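
To illustrate why a canonical form matters, here is a small Python sketch: two JSON texts that hold the same data but differ as text, so a plain text diff would report a change where there is none. The canonical() function is only one possible normalisation (sorted keys, no extra whitespace) and is an assumption for this example, not a standard.

```python
import json

a = '{"name": "group", "size": 2}'
b = '{"size": 2,   "name": "group"}'

# Same data, different text: a text-based diff would see these as different.
same_text = a == b
same_data = json.loads(a) == json.loads(b)

def canonical(text):
    """One possible canonical form (an assumption): sorted keys, compact separators."""
    return json.dumps(json.loads(text), sort_keys=True, separators=(",", ":"))

print(same_text, same_data, canonical(a) == canonical(b))
```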

for attachments, binary diffs would be interesting, I wonder how
usable git's format would be for us?

On Wed, Nov 26, 2008 at 5:14 PM, Jan Lehnardt <ja...@apache.org> wrote:
>
> On 26 Nov 2008, at 12:40, Dan wrote:
>
>> Here is a conversation I had on the IRC channel #couchdb on this subject
>> on
>> November 24, 2008 (2 days ago). Hope this helps!
>
> I kinda don't like IRC quotes floating around, but hey, I didn't put up any
> disclaimers either. Take the following with a grain of salt :)
>
>
>> (04:01:26 PM) dsimard: I just  wanted to know if an attachment changes,
>> will
>> the new revision contain just the "diff" with the old attachment or the
>> complete attachment?
>> (04:01:49 PM) jan____: complete attachment. diffs are the devil
>> (04:03:08 PM) dsimard: damn... all fields of a document are stored as a
>> full
>> document?
>> (04:03:18 PM) dsimard: I really thought that diffs were used
>> (04:03:35 PM) jan____: no, no diffs. diffs are the devil
>> (04:04:14 PM) dsimard: ok, could you elaborate on the evilness of diffs?
>> (04:04:44 PM) dsimard: I just want to know more about it
>> (04:05:04 PM) jan____: dsimard: you need to keep diffs around forever to
>> construct the latest live doc. this totally conflicts with the couchdb
>> storage model which uses full representations of each revision.
>> (04:05:04 PM) dsimard: or if you have a good link about it
>> (04:05:35 PM) jan____:
>> http://incubator.apache.org/couchdb/docs/overview.html
>> (04:05:36 PM) jan____: that one
>>
>> In my opinion, it would be a great addition to couchdb. But still, I can't
>> wait to use it on my next project.
>
>
>
>
>>
>>
>>
>> On Wed, Nov 26, 2008 at 3:25 PM, Liam Staskawicz <ls...@gmail.com> wrote:
>>
>>>
>>> On Nov 26, 2008, at 12:20 PM, Noah Slater wrote:
>>>
>>> On Wed, Nov 26, 2008 at 09:57:24AM -0800, Liam Staskawicz wrote:
>>>>
>>>>> When updating a document, is there any notion of submitting a partial
>>>>> update?
>>>>> It seems like being able to specify that only some subset of the fields
>>>>> in a
>>>>> document should be updated would offer some efficiency benefits.  I
>>>>> guess
>>>>> I
>>>>> had in mind some scenario where CouchDB would create the updated record
>>>>> by
>>>>> merging the existing revision with the new info and saving the updated
>>>>> revision, but I'm still new to CouchDB so I don't have a good sense of
>>>>> whether
>>>>> this tramples on any important concepts.
>>>>>
>>>>
>>>> Nope, CouchDB does not support this at the moment. If you want to make
>>>> an
>>>> update
>>>> you have to send the entire document each time.
>>>>
>>>> There is some discussion among CouchDB users and developers about the
>>>> benefits
>>>> of partial updates but it seems the real sticking point so far is
>>>> deciding
>>>> on
>>>> the mechanism for enabling this. It seems the rough consensus at this
>>>> point is
>>>> that whatever method we use be something that is standardised, either
>>>> through a
>>>> standards body or de facto within the larger JSON community.
>>>>
>>>
>>> Thanks for the response - and yeah, this is not a sticking point at the
>>> moment but as systems start to ramp up this seems like a pretty good way
>>> to
>>> make the back and forths much more efficient.   Will be looking forward
>>> to
>>> this being introduced at some point.
>>>
>>> Liam
>>>
>
>



-- 
Chris Anderson
http://jchris.mfdz.com

Re: partial/diff updates?

Posted by Jan Lehnardt <ja...@apache.org>.
On 26 Nov 2008, at 12:40, Dan wrote:

> Here is a conversation I had on the IRC channel #couchdb on this  
> subject on
> November 24, 2008 (2 days ago). Hope this helps!

I kinda don't like IRC quotes floating around, but hey, I didn't put  
up any
disclaimers either. Take the following with a grain of salt :)


> (04:01:26 PM) dsimard: I just  wanted to know if an attachment  
> changes, will
> the new revision contain just the "diff" with the old attachment or  
> the
> complete attachment?
> (04:01:49 PM) jan____: complete attachment. diffs are the devil
> (04:03:08 PM) dsimard: damn... all fields of a document are stored  
> as a full
> document?
> (04:03:18 PM) dsimard: I really thought that diffs were used
> (04:03:35 PM) jan____: no, no diffs. diffs are the devil
> (04:04:14 PM) dsimard: ok, could you elaborate on the evilness of  
> diffs?
> (04:04:44 PM) dsimard: I just want to know more about it
> (04:05:04 PM) jan____: dsimard: you need to keep diffs around  
> forever to
> construct the latest live doc. this totally conflicts with the couchdb
> storage model which uses full representations of each revision.
> (04:05:04 PM) dsimard: or if you have a good link about it
> (04:05:35 PM) jan____:
> http://incubator.apache.org/couchdb/docs/overview.html
> (04:05:36 PM) jan____: that one
>
> In my opinion, it would be a great addition to couchdb. But still, I  
> can't
> wait to use it on my next project.




>
>
>
> On Wed, Nov 26, 2008 at 3:25 PM, Liam Staskawicz <ls...@gmail.com>  
> wrote:
>
>>
>> On Nov 26, 2008, at 12:20 PM, Noah Slater wrote:
>>
>> On Wed, Nov 26, 2008 at 09:57:24AM -0800, Liam Staskawicz wrote:
>>>
>>>> When updating a document, is there any notion of submitting a  
>>>> partial
>>>> update?
>>>> It seems like being able to specify that only some subset of the  
>>>> fields
>>>> in a
>>>> document should be updated would offer some efficiency benefits.   
>>>> I guess
>>>> I
>>>> had in mind some scenario where CouchDB would create the updated  
>>>> record
>>>> by
>>>> merging the existing revision with the new info and saving the  
>>>> updated
>>>> revision, but I'm still new to CouchDB so I don't have a good  
>>>> sense of
>>>> whether
>>>> this tramples on any important concepts.
>>>>
>>>
>>> Nope, CouchDB does not support this at the moment. If you want to  
>>> make an
>>> update
>>> you have to send the entire document each time.
>>>
>>> There is some discussion among CouchDB users and developers about  
>>> the
>>> benefits
>>> of partial updates but it seems the real sticking point so far is  
>>> deciding
>>> on
>>> the mechanism for enabling this. It seems the rough consensus at  
>>> this
>>> point is
>>> that whatever method we use be something that is standardised,  
>>> either
>>> through a
>>> standards body or de facto within the larger JSON community.
>>>
>>
>> Thanks for the response - and yeah, this is not a sticking point at  
>> the
>> moment but as systems start to ramp up this seems like a pretty  
>> good way to
>> make the back and forths much more efficient.   Will be looking  
>> forward to
>> this being introduced at some point.
>>
>> Liam
>>


Re: partial/diff updates?

Posted by Dan <ds...@gmail.com>.
Here is a conversation I had on the IRC channel #couchdb on this subject on
November 24, 2008 (2 days ago). Hope this helps!

(04:01:26 PM) dsimard: I just  wanted to know if an attachment changes, will
the new revision contain just the "diff" with the old attachment or the
complete attachment?
(04:01:49 PM) jan____: complete attachment. diffs are the devil
(04:03:08 PM) dsimard: damn... all fields of a document are stored as a full
document?
(04:03:18 PM) dsimard: I really thought that diffs were used
(04:03:35 PM) jan____: no, no diffs. diffs are the devil
(04:04:14 PM) dsimard: ok, could you elaborate on the evilness of diffs?
(04:04:44 PM) dsimard: I just want to know more about it
(04:05:04 PM) jan____: dsimard: you need to keep diffs around forever to
construct the latest live doc. this totally conflicts with the couchdb
storage model which uses full representations of each revision.
(04:05:04 PM) dsimard: or if you have a good link about it
(04:05:35 PM) jan____:
http://incubator.apache.org/couchdb/docs/overview.html
(04:05:36 PM) jan____: that one

In my opinion, it would be a great addition to couchdb. But still, I can't
wait to use it on my next project.
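
The storage argument in the IRC log can be sketched in Python. With diff-based storage, the latest document exists only as the result of replaying every diff since the base revision; with full revisions, it can be read directly. The diff format used here ('set' and 'unset') is invented for illustration and is not something CouchDB defines.

```python
def apply_diff(doc, diff):
    """Apply a hypothetical diff of the form {'set': {...}, 'unset': [...]}."""
    result = dict(doc)
    result.update(diff.get("set", {}))
    for key in diff.get("unset", []):
        result.pop(key, None)
    return result

# Diff-based storage: only the first revision is stored in full.
base = {"title": "draft"}
diffs = [
    {"set": {"author": "dan"}},
    {"set": {"title": "final"}},
    {"unset": ["author"]},
]

def latest_from_diffs(base, diffs):
    doc = base
    for diff in diffs:  # every diff ever written has to be replayed
        doc = apply_diff(doc, diff)
    return doc

# Full-revision storage: each revision is complete, so the latest is read directly.
full_revisions = [base]
for diff in diffs:
    full_revisions.append(apply_diff(full_revisions[-1], diff))

print(latest_from_diffs(base, diffs))
print(full_revisions[-1])
```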


On Wed, Nov 26, 2008 at 3:25 PM, Liam Staskawicz <ls...@gmail.com> wrote:

>
> On Nov 26, 2008, at 12:20 PM, Noah Slater wrote:
>
>  On Wed, Nov 26, 2008 at 09:57:24AM -0800, Liam Staskawicz wrote:
>>
>>> When updating a document, is there any notion of submitting a partial
>>> update?
>>> It seems like being able to specify that only some subset of the fields
>>> in a
>>> document should be updated would offer some efficiency benefits.  I guess
>>> I
>>> had in mind some scenario where CouchDB would create the updated record
>>> by
>>> merging the existing revision with the new info and saving the updated
>>> revision, but I'm still new to CouchDB so I don't have a good sense of
>>> whether
>>> this tramples on any important concepts.
>>>
>>
>> Nope, CouchDB does not support this at the moment. If you want to make an
>> update
>> you have to send the entire document each time.
>>
>> There is some discussion among CouchDB users and developers about the
>> benefits
>> of partial updates but it seems the real sticking point so far is deciding
>> on
>> the mechanism for enabling this. It seems the rough consensus at this
>> point is
>> that whatever method we use be something that is standardised, either
>> through a
>> standards body or de facto within the larger JSON community.
>>
>
> Thanks for the response - and yeah, this is not a sticking point at the
> moment but as systems start to ramp up this seems like a pretty good way to
> make the back and forths much more efficient.   Will be looking forward to
> this being introduced at some point.
>
> Liam
>

Re: partial/diff updates?

Posted by Liam Staskawicz <ls...@gmail.com>.
On Nov 26, 2008, at 12:20 PM, Noah Slater wrote:

> On Wed, Nov 26, 2008 at 09:57:24AM -0800, Liam Staskawicz wrote:
>> When updating a document, is there any notion of submitting a  
>> partial update?
>> It seems like being able to specify that only some subset of the  
>> fields in a
>> document should be updated would offer some efficiency benefits.  I  
>> guess I
>> had in mind some scenario where CouchDB would create the updated  
>> record by
>> merging the existing revision with the new info and saving the  
>> updated
>> revision, but I'm still new to CouchDB so I don't have a good sense  
>> of whether
>> this tramples on any important concepts.
>
> Nope, CouchDB does not support this at the moment. If you want to  
> make an update
> you have to send the entire document each time.
>
> There is some discussion among CouchDB users and developers about  
> the benefits
> of partial updates but it seems the real sticking point so far is  
> deciding on
> the mechanism for enabling this. It seems the rough consensus at  
> this point is
> that whatever method we use be something that is standardised,  
> either through a
> standards body or de facto within the larger JSON community.

Thanks for the response - and yeah, this is not a sticking point at  
the moment but as systems start to ramp up this seems like a pretty  
good way to make the back and forths much more efficient.   Will be  
looking forward to this being introduced at some point.

Liam

Re: partial/diff updates?

Posted by Noah Slater <ns...@apache.org>.
On Wed, Nov 26, 2008 at 09:57:24AM -0800, Liam Staskawicz wrote:
> When updating a document, is there any notion of submitting a partial update?
> It seems like being able to specify that only some subset of the fields in a
> document should be updated would offer some efficiency benefits.  I guess I
> had in mind some scenario where CouchDB would create the updated record by
> merging the existing revision with the new info and saving the updated
> revision, but I'm still new to CouchDB so I don't have a good sense of whether
> this tramples on any important concepts.

Nope, CouchDB does not support this at the moment. If you want to make an update
you have to send the entire document each time.

There is some discussion among CouchDB users and developers about the benefits
of partial updates but it seems the real sticking point so far is deciding on
the mechanism for enabling this. It seems the rough consensus at this point is
that whatever method we use be something that is standardised, either through a
standards body or de facto within the larger JSON community.

Hope this helps,

-- 
Noah Slater, http://tumbolia.org/nslater

Re: partial/diff updates?

Posted by Antony Blakey <an...@gmail.com>.
Have a look at the archive for this list - the thread 'Document  
Updates' contains some pertinent discussion.

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Always have a vision. Why spend your life making other people’s dreams?
  -- Orson Welles (1915-1985)


Re: partial/diff updates?

Posted by Paul Carey <pa...@gmail.com>.
> If I could do partial updates, I would store all the user_ids inside a
> field of a group doc such as:
>
> "memberships": [user_id1, user_id2, etc]
>
> I could do this right now, but what if I want to update or delete a
> membership? If a group has a million users

This model suggests that you would effectively be allowing a million
users to write to a single document, which is probably unwise if
contention-free writes are a goal of your application.

A better approach might be to store the group membership in each user
doc. A view could emit the group_id key against the user doc - this
would provide roughly the same data as a group doc containing an array
of user ids.
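
A rough Python emulation of this layout, for illustration only. CouchDB views are normally written as JavaScript map functions; the document fields (type, group_ids) and the function names below are assumptions, not part of the original suggestion.

```python
from collections import defaultdict

# Hypothetical user documents; each user records its own group memberships.
user_docs = [
    {"_id": "user_1", "type": "user", "group_ids": ["g1", "g2"]},
    {"_id": "user_2", "type": "user", "group_ids": ["g1"]},
    {"_id": "user_3", "type": "user", "group_ids": []},
]

def map_memberships(doc):
    """Emulates a view map function: emit one (group_id, user_id) row per membership."""
    if doc.get("type") == "user":
        for group_id in doc.get("group_ids", []):
            yield group_id, doc["_id"]

def build_view(docs):
    """Collect emitted rows by key, roughly what querying the view by key would return."""
    index = defaultdict(list)
    for doc in docs:
        for key, value in map_memberships(doc):
            index[key].append(value)
    return dict(index)

view = build_view(user_docs)
print(view["g1"])
```

Adding or removing a membership then touches only one small user document, so writers for different users never contend for the same document.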

Paul

partial/diff updates?

Posted by Patrick Aljord <pa...@gmail.com>.
On Thu, Nov 27, 2008 at 1:12 PM, Jan Lehnardt <ja...@apache.org> wrote:

> The problem is that this would be a synthetic benchmark. I'd still very
> much like to
> see a real world application that shows in numbers that it needs partial
> updates.


Partial updates would be good in the user/group/membership situation. Right
now, I have to create a membership doc to store the user_id and group_id in
it. If I could do partial updates, I would store all the user_ids inside a
field of a group doc such as:

"memberships": [user_id1, user_id2, etc]

I could do this right now, but what if I want to update or delete a
membership? If a group has a million users, I'd need to post the whole doc
and that would definitely not scale compared to a partial update that would
just send a little diff IMO.
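
A back-of-the-envelope sketch of the size difference, in Python. The document layout follows the example above; the partial update format (remove_from) is invented for illustration, and the member count is reduced to 100,000 so the example runs quickly.

```python
import json

# Hypothetical group document with many member ids.
member_count = 100_000
group_doc = {
    "_id": "group_1",
    "memberships": [f"user_{i}" for i in range(member_count)],
}

# A hypothetical partial update that removes one member (invented format).
partial_update = {"_id": "group_1", "remove_from": {"memberships": ["user_42"]}}

full_bytes = len(json.dumps(group_doc).encode("utf-8"))
partial_bytes = len(json.dumps(partial_update).encode("utf-8"))

print(f"full document: {full_bytes} bytes, partial update: {partial_bytes} bytes")
```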

PS: I'm resending this mail to user@couchdb.apache.org

Re: partial/diff updates?

Posted by Jan Lehnardt <ja...@apache.org>.
On 27 Nov 2008, at 09:53, Burobjorn wrote:

> I like this approach.
>
> This also creates the possibility to benchmark and compare the diff
> methodology versus a full update.

The problem is that this would be a synthetic benchmark. I'd still  
very much like to
see a real world application that shows in numbers that it needs  
partial updates.
Until then it is just premature optimization. A neat feature for sure,  
but needed,
nobody knows.

Jan
--


> Would it be possible then to use
> 'standard' text diffs as well, without taking into account that we're
> dealing with json?
>
> All the best,
> grtz
> BjornW
>
>
> * b u r o b j o r n .nl *
> digitaal vakmanschap | digital craftsmanship
>
> Concordiastraat 68-126
> 3551 EM Utrecht
> The Netherlands
>
> phone: +31 6 49 74 78 70
> http://www.burobjorn.nl
>
>
> Timo Isokoski wrote:
>> Is this talk about the "diff" feature related to
>>
>> a) How CouchDB physically stores the data on disk
>> b) How data is transmitted between the client and CouchDB
>>
>> In case a) I think diffs are the devil and it goes against the
>> simplicity of
>> CouchDB's inner workings. In case b), wouldn't it be easy to
>> implement some
>> kind of a prototype of this feature as a "proxy server" on top of  
>> CouchDB.
>> The proxy could route the normal requests directly to CouchDB and  
>> the actual
>> diff requests could be handled like this:
>> 1. GET the original document from Couch
>> 2. Apply diff
>> 3. PUT the modified document back to the Couch
>>
>> The functionality can then be integrated into CouchDB itself if the
>> prototype works well and people start using it.
>>
>>
>> -Timo
>>
>>
>>
>> 2008/11/27 Antony Blakey <an...@gmail.com>
>>
>>> On 27/11/2008, at 10:10 PM, Noah Slater wrote:
>>>
>>> On Thu, Nov 27, 2008 at 08:45:18PM +1030, Antony Blakey wrote:
>>>>> * JPath itself is a nebulous concept.
>>>>> In what sense do you think the concept is nebulous?
>>>>>
>>>> It lacks an RFC. :)
>>>>
>>> I didn't realize that JSON had an RFC! Now that I've read it, I  
>>> think that
>>> this statement:
>>> "A JSON text is a serialized object or array."
>>> which dominates this subsequent statement:
>>> "The names within an object SHOULD be unique."
>>> clearly resolves the ambiguity discussed in a previous thread  
>>> regarding
>>> duplicate hash keys, in the manner that I suggested. Namely,  
>>> duplicate keys
>>> are not allowed because they cannot be the result of serializing a
>>> javascript object. It specifically defines a JSON *text*, so model
>>> equivalence isn't sufficient.
>>> Given that JPath is a subset of javascript access path syntax and
>>> semantics, would a definition that references the appropriate ECMA  
>>> clauses
>>> meet with your approval? Or is this issue blocked IYO until a full  
>>> JSON
>>> transformation/mutation/update RFC is approved (whatever approval  
>>> means).
>>> Antony Blakey
>>> -------------
>>> CTO, Linkuistics Pty Ltd
>>> Ph: 0438 840 787
>>>
>>> There are two ways of constructing a software design: One way is  
>>> to make it
>>> so simple that there are obviously no deficiencies, and the other  
>>> way is to
>>> make it so complicated that there are no obvious deficiencies.
>>> -- C. A. R. Hoare
>>>
>>>
>>>
>>
>>
>


Re: partial/diff updates?

Posted by Burobjorn <bu...@gmail.com>.
I like this approach.

This also creates the possibility to benchmark and compare the diff
methodology versus a full update. Would it be possible then to use
'standard' text diffs as well, without taking into account that we're
dealing with json?
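
The approach in question is the proxy from the quoted message below: GET the document, apply the diff, PUT it back. A minimal Python sketch of that flow follows. ToyStore is an in-memory stand-in, not CouchDB: it uses integer revisions for simplicity and models only the fact that a write with a stale revision is rejected. The names ToyStore, Conflict and proxy_patch are assumptions for this example.

```python
class Conflict(Exception):
    """Raised when a write carries a stale revision."""

class ToyStore:
    """In-memory stand-in for the database: full documents only, revision checked on write."""
    def __init__(self):
        self._docs = {}

    def get(self, doc_id):
        return dict(self._docs[doc_id])

    def put(self, doc):
        current = self._docs.get(doc["_id"])
        if current is not None and current["_rev"] != doc.get("_rev"):
            raise Conflict(doc["_id"])
        stored = dict(doc)
        stored["_rev"] = (current["_rev"] if current else 0) + 1
        self._docs[doc["_id"]] = stored
        return stored["_rev"]

def proxy_patch(store, doc_id, changes, retries=3):
    """Steps 1-3 of the proxy: GET the document, apply the diff, PUT the full document."""
    for _ in range(retries):
        doc = store.get(doc_id)      # 1. GET the original document
        doc.update(changes)          # 2. apply the (shallow) diff
        try:
            return store.put(doc)    # 3. PUT the modified document back
        except Conflict:
            continue                 # someone else wrote first; re-read and retry
    raise Conflict(doc_id)

store = ToyStore()
store.put({"_id": "doc1", "title": "draft", "views": 0})
new_rev = proxy_patch(store, "doc1", {"title": "final"})
print(new_rev, store.get("doc1"))
```

The client sends only the changed fields to the proxy, but the database still receives and stores a full document, so the storage model is untouched.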

All the best,
grtz
BjornW


* b u r o b j o r n .nl *
digitaal vakmanschap | digital craftsmanship

Concordiastraat 68-126
3551 EM Utrecht
The Netherlands

phone: +31 6 49 74 78 70
http://www.burobjorn.nl


Timo Isokoski wrote:
> Is this talk about the "diff" feature related to
> 
> a) How CouchDB physically stores the data on disk
> b) How data is transmitted between the client and CouchDB
> 
> In case a) I think diffs are the devil and it goes against the simplicity of
> CouchDB's inner workings. In case b), wouldn't it be easy to implement some
> kind of a prototype of this feature as a "proxy server" on top of CouchDB.
> The proxy could route the normal requests directly to CouchDB and the actual
> diff requests could be handled like this:
> 1. GET the original document from Couch
> 2. Apply diff
> 3. PUT the modified document back to the Couch
> 
> The functionality can then be integrated into CouchDB itself if the
> prototype works well and people start using it.
> 
> 
> -Timo
> 
> 
> 
> 2008/11/27 Antony Blakey <an...@gmail.com>
> 
>> On 27/11/2008, at 10:10 PM, Noah Slater wrote:
>>
>>  On Thu, Nov 27, 2008 at 08:45:18PM +1030, Antony Blakey wrote:
>>>> * JPath itself is a nebulous concept.
>>>> In what sense do you think the concept is nebulous?
>>>>
>>> It lacks an RFC. :)
>>>
>> I didn't realize that JSON had an RFC! Now that I've read it, I think that
>> this statement:
>>  "A JSON text is a serialized object or array."
>> which dominates this subsequent statement:
>>  "The names within an object SHOULD be unique."
>> clearly resolves the ambiguity discussed in a previous thread regarding
>> duplicate hash keys, in the manner that I suggested. Namely, duplicate keys
>> are not allowed because they cannot be the result of serializing a
>> javascript object. It specifically defines a JSON *text*, so model
>> equivalence isn't sufficient.
>> Given that JPath is a subset of javascript access path syntax and
>> semantics, would a definition that references the appropriate ECMA clauses
>> meet with your approval? Or is this issue blocked IYO until a full JSON
>> transformation/mutation/update RFC is approved (whatever approval means).
>> Antony Blakey
>> -------------
>> CTO, Linkuistics Pty Ltd
>> Ph: 0438 840 787
>>
>> There are two ways of constructing a software design: One way is to make it
>> so simple that there are obviously no deficiencies, and the other way is to
>> make it so complicated that there are no obvious deficiencies.
>>  -- C. A. R. Hoare
>>
>>
>>
> 
> 

Re: partial/diff updates?

Posted by Paul Davis <pa...@gmail.com>.
On Fri, Nov 28, 2008 at 7:08 PM, Noah Slater <ns...@apache.org> wrote:
> On Sat, Nov 29, 2008 at 11:15:17AM +1030, Antony Blakey wrote:
>> First though, who is actually interested in working on it?
>
> I am happy to contribute to the spec writing.
>

I'd help on this front.

> --
> Noah Slater, http://tumbolia.org/nslater
>

Re: partial/diff updates?

Posted by Noah Slater <ns...@apache.org>.
On Sat, Nov 29, 2008 at 11:15:17AM +1030, Antony Blakey wrote:
> First though, who is actually interested in working on it?

I am happy to contribute to the spec writing.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: partial/diff updates?

Posted by Antony Blakey <an...@gmail.com>.
On 28/11/2008, at 8:57 PM, Noah Slater wrote:

> So, how do we get started?

Logistically, I'm not sure.

Procedurally I propose an informal spec, followed by a proof-of- 
concept implementation as a Couch branch and a JS implementation,  
feedback/analysis/improvement of those, and then formalization as an  
RFC.

First though, who is actually interested in working on it?

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Always have a vision. Why spend your life making other people’s dreams?
  -- Orson Welles (1915-1985)


Re: partial/diff updates?

Posted by Noah Slater <ns...@apache.org>.
On Fri, Nov 28, 2008 at 08:34:04PM +1030, Antony Blakey wrote:
> Well, how about we define one, and submit it as a RFC. It doesn't have to go
> through the JSON community. If a different community wants to extend that RFC,
> allowing for profiles e.g. JSONDiff profile 0 => no expressions, JSONDiff
> profile 1 => assume a javascript interpreter.

Okay, sure. If the wider JSON community wants to get involved, great!

I am guessing one of the first things we should do is take this elsewhere:

  http://groups.google.com/group/json-diff

So, how do we get started?

Best,

-- 
Noah Slater, http://tumbolia.org/nslater

Re: partial/diff updates?

Posted by Antony Blakey <an...@gmail.com>.
On 28/11/2008, at 5:02 PM, Paul Davis wrote:

> On Fri, Nov 28, 2008 at 12:13 AM, Antony Blakey <antony.blakey@gmail.com 
> > wrote:
>>
>> On 28/11/2008, at 4:08 PM, Paul Davis wrote:
>>
>>> Any JSON diff format is going to require the JSON RFC's 'SHOULD' to
>>> either be changed to a 'MAY' or 'MUST'. The ambiguity in 'SHOULD' is
>>> going to cause problems.
>>
>> That "SHOULD" is dominated by the preceding statement that "A JSON  
>> text is a
>> serialized object or array", so it is effectively a "MUST" because  
>> the spec
>> doesn't allow for any other possibility. IMO the JSON spec is not  
>> ambiguous
>> in respect of this issue.
>>
>
> I fail to see how "A JSON text is a serialized object or array" in any
> way dictates that object member names must be unique. Obviously to see
> such an interpretation we have to ignore 99% of the current JSON
> implementations that assume JSON object member names are unique, but
> still, a strict reading of the spec allows non unique member names.
> And as I said, the current JSON deserializer that couchdb uses allows
> for non-unique member names (but only ever operates on the first).

Because of this:

"The terms "object" and "array" come from the conventions of  
JavaScript."

Javascript objects have unique field names. Combine that with this:

"A JSON text is a serialized object or array."

and it means that a valid JSON text IS the serialization of a  
javascript object or array, and hence cannot contain hashes with  
duplicate names. It is not that a duplicate name is allowed but  
resolved according to some operational convention (first or last name  
wins), but rather that a JSON text that contains a hash with a  
duplicate name is not a valid JSON document because it cannot be the  
serialization of a javascript object.
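
Parsers differ on this point in practice. As an illustration using Python's standard json module (not the parser CouchDB used), a text with a repeated name is accepted and the last value wins, while a strict reading can be enforced with an object_pairs_hook. The reject_duplicates function is a hypothetical helper written for this example.

```python
import json

text = '{"name": "first", "name": "second"}'

# Python's json module accepts the text and keeps the last value for a repeated name.
lenient = json.loads(text)

def reject_duplicates(pairs):
    """object_pairs_hook that treats a repeated name as an error (the strict reading)."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate name: {key!r}")
        seen[key] = value
    return seen

try:
    json.loads(text, object_pairs_hook=reject_duplicates)
    strict_result = "accepted"
except ValueError as exc:
    strict_result = str(exc)

print(lenient, strict_result)
```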

>> For sure. My point is that a JSON-community driven diff might  
>> presume that
>> Javascript expressions (as in JSONPath) are OK, and hence assume  
>> you would
>> use "@.length - 1". The context of the JSON community is not  
>> necessarily
>> Couch's context.
>>
>
> Here we're totally in agreement. We should yell and scream at the JSON
> community for being shortsighted in their ideas of requiring a full
> JavaScript engine to evaluate their RFC's. That's not to say we don't
> want them on our side though. Perhaps it should be more of a subtle
> coup d'état in how we approach making a diff spec.

Well, how about we define one, and submit it as a RFC. It doesn't have  
to go through the JSON community. If a different community wants to  
extend that RFC, allowing for profiles e.g. JSONDiff profile 0 => no  
expressions, JSONDiff profile 1 => assume a javascript interpreter.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Did you hear about the Buddhist who refused Novocain during a root  
canal?
His goal: transcend dental medication.



Re: partial/diff updates?

Posted by Paul Davis <pa...@gmail.com>.
On Fri, Nov 28, 2008 at 12:13 AM, Antony Blakey <an...@gmail.com> wrote:
>
> On 28/11/2008, at 4:08 PM, Paul Davis wrote:
>
>> Any JSON diff format is going to require the JSON RFC's 'SHOULD' to
>> either be changed to a 'MAY' or 'MUST'. The ambiguity in 'SHOULD' is
>> going to cause problems.
>
> That "SHOULD" is dominated by the preceding statement that "A JSON text is a
> serialized object or array", so it is effectively a "MUST" because the spec
> doesn't allow for any other possibility. IMO the JSON spec is not ambiguous
> in respect of this issue.
>

I fail to see how "A JSON text is a serialized object or array" in any
way dictates that object member names must be unique. Obviously to see
such an interpretation we have to ignore 99% of the current JSON
implementations that assume JSON object member names are unique, but
still, a strict reading of the spec allows non unique member names.
And as I said, the current JSON deserializer that couchdb uses allows
for non-unique member names (but only ever operates on the first).

>> Or we could use a array[-1] syntax like some languages.
>
> For sure. My point is that a JSON-community driven diff might presume that
> Javascript expressions (as in JSONPath) are OK, and hence assume you would
> use "@.length - 1". The context of the JSON community is not necessarily
> Couch's context.
>

Here we're totally in agreement. We should yell and scream at the JSON
community for being shortsighted in their ideas of requiring a full
JavaScript engine to evaluate their RFC's. That's not to say we don't
want them on our side though. Perhaps it should be more of a subtle
coup d'état in how we approach making a diff spec.

> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> There is nothing more difficult to plan, more doubtful of success, nor more
> dangerous to manage than the creation of a new order of things... Whenever
> his enemies have the ability to attack the innovator, they do so with the
> passion of partisans, while the others defend him sluggishly, So that the
> innovator and his party alike are vulnerable.
>  -- Niccolo Machiavelli, 1513, The Prince.
>
>
>

Re: partial/diff updates?

Posted by Antony Blakey <an...@gmail.com>.
On 28/11/2008, at 4:08 PM, Paul Davis wrote:

> Any JSON diff format is going to require the JSON RFC's 'SHOULD' to
> either be changed to a 'MAY' or 'MUST'. The ambiguity in 'SHOULD' is
> going to cause problems.

That "SHOULD" is dominated by the preceding statement that "A JSON  
text is a serialized object or array", so it is effectively a "MUST"  
because the spec doesn't allow for any other possibility. IMO the JSON  
spec is not ambiguous in respect of this issue.

> Or we could use a array[-1] syntax like some languages.

For sure. My point is that a JSON-community driven diff might presume  
that Javascript expressions (as in JSONPath) are OK, and hence assume  
you would use "@.length - 1". The context of the JSON community is not  
necessarily Couch's context.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

There is nothing more difficult to plan, more doubtful of success, nor  
more dangerous to manage than the creation of a new order of things...  
Whenever his enemies have the ability to attack the innovator, they do  
so with the passion of partisans, while the others defend him  
sluggishly, So that the innovator and his party alike are vulnerable.
   -- Niccolo Machiavelli, 1513, The Prince.



Re: partial/diff updates?

Posted by Paul Davis <pa...@gmail.com>.
On Thu, Nov 27, 2008 at 10:55 PM, Antony Blakey <an...@gmail.com> wrote:
>
> On 28/11/2008, at 2:02 PM, Paul Davis wrote:
>
>> I think what Noah might be saying is, "As soon as there's a JSON diff
>> RFC, we'll implement it". Which I agree with completely. Until then,
>> if we implement something it'd most likely be not the RFC which we've
>> all had to deal with when coding to web 'standards'. It's not fun.
>>
>> That said, pushing the JSON community towards acceptance of a diff
>> format is something we could do. Not sure how we should organize other
>> than all joining the JSON lists and pushing. And then nominating Noah
>> to write the RFC that we'll implement. Matter of fact I kinda like
>> that idea. Anyone with me?
>

It's a holiday and I've had beers, so I'm taking the lazy method of not
Googling and just refuting sentence by sentence.

> All IMO (of course) ...
>
> There's no guarantee that a JSON diff RFC will happen in any reasonable time
> frame if it doesn't arise from this group, which has a specific need in
> mind. And I've had plenty of problems coding to web standards that are
> RFC's. Even when there is an RFC there is no guarantee that it is
> unambiguous or correctly implemented - witness the recent discussion about
> duplicate names in JSON hashes.
>

Any JSON diff format is going to require the JSON RFC's 'SHOULD' to
either be changed to a 'MAY' or 'MUST'. The ambiguity in 'SHOULD' is
going to cause problems.

> Furthermore, there's no guarantee that a proposal from the JSON community
> will be appropriate. Consider the current JSONPath proposal linked from
> www.json.com (acknowledging that JSONPath might not be a component of a diff
> spec). It presumes a script engine, and IMO only an explicitly declarative
> model without the need for expression evaluation would be suitable for Couch
> (which doesn't mandate a JS engine).

I agree entirely. The 'SHOULD' would've been much better as a 'MUST'.

> As an example, consider a reference to
> the last element of an array, which in the current proposal is like this:
> "$.store.book[(@.length-1)]". You could use an explicit index, but that
> presumes you have the entire structure of interest. I can imagine wanting to
> generate a stream of partial updates that append to a list, where you don't
> want the client to have to track the entire list structure.

Or we could use a array[-1] syntax like some languages. Tracking the
entire list is pretty much out of the question. Take it for granted
that CouchDB is going to store entire documents. Any diff format that
is used will only be an optimization for editing documents.
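
A minimal sketch of the declarative alternative, in Python and purely
as an illustration: a path is a plain list of names and integer
indexes, with -1 meaning "last element", so nothing has to be
evaluated as an expression. The names resolve and append_at are
hypothetical and not part of any proposal.

```python
def resolve(doc, path):
    """Follow a list of object names and integer indexes through doc."""
    node = doc
    for step in path:
        node = node[step]  # a negative integer counts from the end of a list
    return node


def append_at(doc, path, value):
    """Append to the list found at path without knowing its length."""
    resolve(doc, path).append(value)


doc = {"store": {"book": [{"title": "A"}, {"title": "B"}]}}

last_before = resolve(doc, ["store", "book", -1])
append_at(doc, ["store", "book"], {"title": "C"})
last_after = resolve(doc, ["store", "book", -1])
```

The client appending an entry never has to hold or count the existing
list, which is the case the quoted paragraph is worried about.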

> One response
> might be 'write them as separate documents', but then you are forced to
> fake-join/merge in your view, which can get very complicated, and run into
> intermediate documents that expand during (re)reduction.
>

I'm not sure what you mean by this.

> Finally, why does this have to be driven through the 'JSON community'. Given
> that Couch is an *alpha* product, this is a good time to implement
> something, which IMO is the best way to prove a particular model. Why not
> just implement something to get a feel for suitability? We don't have to
> push anyone, we just have to do it, surely? And yes, I understand the
> inherent irony of that statement :)
>

In my mind there are two methods in this scenario. The IE way, and the
not IE way. IE coded to their whims and desires. I would be very
unsettled if CouchDB took it upon itself to implement a custom spec of
JSON diff without input from the community. You get into the whole
support/breaking changes issues that suck.

As Jan mentioned, there's no actual proof this is even needed yet
beyond most of us *thinking* it would probably be a good thing.

> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> Every task involves constraint,
> Solve the thing without complaint;
> There are magic links and chains
> Forged to loose our rigid brains.
> Structures, structures, though they bind,
> Strangely liberate the mind.
>  -- James Fallen
>
>
>

Re: partial/diff updates?

Posted by Jan Lehnardt <ja...@apache.org>.
On 27 Nov 2008, at 20:55, Antony Blakey wrote:

>
> On 28/11/2008, at 2:02 PM, Paul Davis wrote:
>
>> I think what Noah might be saying is, "As soon as there's a JSON diff
>> RFC, we'll implement it". Which I agree with completely. Until then,
>> if we implement something it'd most likely be not the RFC which we've
>> all had to deal with when coding to web 'standards'. It's not fun.
>>
>> That said, pushing the JSON community towards acceptance of a diff
>> format is something we could do. Not sure how we should organize  
>> other
>> than all joining the JSON lists and pushing. And then nominating Noah
>> to write the RFC that we'll implement. Matter of fact I kinda like
>> that idea. Anyone with me?
>
> All IMO (of course) ...
>
> There's no guarantee that a JSON diff RFC will happen in any  
> reasonable time frame if it doesn't arise from this group, which has  
> a specific need in mind. And I've had plenty of problems coding to  
> web standards that are RFC's. Even when there is an RFC there is no  
> guarantee that it is unambiguous or correctly implemented - witness  
> the recent discussion about duplicate names in JSON hashes.
>
> Furthermore, there's no guarantee that a proposal from the JSON  
> community will be appropriate. Consider the current JSONPath  
> proposal linked from www.json.com (acknowledging that JSONPath might  
> not be a component of a diff spec). It presumes a script engine, and  
> IMO only an explicitly declarative model without the need for  
> expression evaluation would be suitable for Couch (which doesn't  
> mandate a JS engine). As an example, consider a reference to the  
> last element of an array, which in the current proposal is like  
> this: "$.store.book[(@.length-1)]". You could use an explicit index,  
> but that presumes you have the entire structure of interest. I can  
> imagine wanting to generate a stream of partial updates that append  
> to a list, where you don't want the client to have to track the  
> entire list structure. One response might be 'write them as separate  
> documents', but then you are forced to fake-join/merge in your view,  
> which can get very complicated, and run into intermediate documents  
> that expand during (re)reduction.
>
> Finally, why does this have to be driven through the 'JSON  
> community'. Given that Couch is an *alpha* product, this is a good  
> time to implement something, which IMO is the best way to prove a  
> particular model. Why not just implement something to get a feel for  
> suitability? We don't have to push anyone, we just have to do it,  
> surely? And yes, I understand the inherent irony of that statement :)

I feel that there are far more important things that we should get out  
for 0.9 and 1.0 before looking at partial updates. :)

Cheers
Jan
--

Re: partial/diff updates?

Posted by Antony Blakey <an...@gmail.com>.
On 28/11/2008, at 2:02 PM, Paul Davis wrote:

> I think what Noah might be saying is, "As soon as there's a JSON diff
> RFC, we'll implement it". Which I agree with completely. Until then,
> if we implement something it'd most likely be not the RFC which we've
> all had to deal with when coding to web 'standards'. It's not fun.
>
> That said, pushing the JSON community towards acceptance of a diff
> format is something we could do. Not sure how we should organize other
> than all joining the JSON lists and pushing. And then nominating Noah
> to write the RFC that we'll implement. Matter of fact I kinda like
> that idea. Anyone with me?

All IMO (of course) ...

There's no guarantee that a JSON diff RFC will happen in any  
reasonable time frame if it doesn't arise from this group, which has a  
specific need in mind. And I've had plenty of problems coding to web  
standards that are RFC's. Even when there is an RFC there is no  
guarantee that it is unambiguous or correctly implemented - witness  
the recent discussion about duplicate names in JSON hashes.

Furthermore, there's no guarantee that a proposal from the JSON  
community will be appropriate. Consider the current JSONPath proposal  
linked from www.json.com (acknowledging that JSONPath might not be a  
component of a diff spec). It presumes a script engine, and IMO only  
an explicitly declarative model without the need for expression  
evaluation would be suitable for Couch (which doesn't mandate a JS  
engine). As an example, consider a reference to the last element of an  
array, which in the current proposal is like this:  
"$.store.book[(@.length-1)]". You could use an explicit index, but  
that presumes you have the entire structure of interest. I can imagine  
wanting to generate a stream of partial updates that append to a list,  
where you don't want the client to have to track the entire list  
structure. One response might be 'write them as separate documents',  
but then you are forced to fake-join/merge in your view, which can get  
very complicated, and run into intermediate documents that expand  
during (re)reduction.

Finally, why does this have to be driven through the 'JSON community'.  
Given that Couch is an *alpha* product, this is a good time to  
implement something, which IMO is the best way to prove a particular  
model. Why not just implement something to get a feel for suitability?  
We don't have to push anyone, we just have to do it, surely? And yes,  
I understand the inherent irony of that statement :)

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Every task involves constraint,
Solve the thing without complaint;
There are magic links and chains
Forged to loose our rigid brains.
Structures, structures, though they bind,
Strangely liberate the mind.
   -- James Fallen



Re: partial/diff updates?

Posted by Paul Davis <pa...@gmail.com>.
On Thu, Nov 27, 2008 at 7:47 PM, Noah Slater <ns...@apache.org> wrote:
> On Fri, Nov 28, 2008 at 11:31:49AM +1030, Antony Blakey wrote:
>>
>> On 28/11/2008, at 10:36 AM, Chris Anderson wrote:
>>
>>> The question becomes why not just use separate docs?
>>
>> Because the difficulty of doing real joins in a map/reduce framework
>> leads to larger documents that are destructured into smaller fragments
> ...
>
> I too do not buy this argument. It's like saying that the answer to scaling is
> to build smaller websites. There are instances where documents will be large
> enough that sending partial updates will be beneficial, and I think the sooner
> we embrace this fact the better. The only problem we face is that of
> standardisation; everything else is a distraction from the real issue.
>
> --
> Noah Slater, http://tumbolia.org/nslater
>

I think what Noah might be saying is, "As soon as there's a JSON diff
RFC, we'll implement it". Which I agree with completely. Until then,
if we implement something it'd most likely not match the eventual RFC, a
problem we've all had to deal with when coding to web 'standards'. It's not fun.

That said, pushing the JSON community towards acceptance of a diff
format is something we could do. Not sure how we should organize other
than all joining the JSON lists and pushing. And then nominating Noah
to write the RFC that we'll implement. Matter of fact I kinda like
that idea. Anyone with me?

Paul

Re: partial/diff updates?

Posted by Noah Slater <ns...@apache.org>.
On Fri, Nov 28, 2008 at 11:31:49AM +1030, Antony Blakey wrote:
>
> On 28/11/2008, at 10:36 AM, Chris Anderson wrote:
>
>> The question becomes why not just use separate docs?
>
> Because the difficulty of doing real joins in a map/reduce framework
> leads to larger documents that are destructured into smaller fragments
...

I too do not buy this argument. It's like saying that the answer to scaling is
to build smaller websites. There are instances where documents will be large
enough that sending partial updates will be beneficial, and I think the sooner
we embrace this fact the better. The only problem we face is that of
standardisation; everything else is a distraction from the real issue.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: partial/diff updates?

Posted by Antony Blakey <an...@gmail.com>.
On 28/11/2008, at 10:36 AM, Chris Anderson wrote:

> The question becomes why not just use separate docs?

Because the difficulty of doing real joins in a map/reduce framework  
leads to larger documents that are destructured into smaller fragments  
(views) using map/reduce, as opposed to the RDBM philosophy of using  
small normalized fragments that are combined into larger results by  
query joins.

This is how I have come to think of CouchDB (and I think it's an  
aesthetic conceptualization), hence my interest in partial updates,  
which IMO are one requirement to round out CouchDB's applicability.

I guess partial gets are actually views, so that's something of a  
furphy, although it's expensive to maintain access-path equivalence  
between a view and its source document because it requires either
awareness on the client of the mapping from view to original document  
structure, or the full document needs to be included in the view,  
which reduces views to indexes.
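
A rough sketch of the "partial get as a view" idea, simulated in
Python. Real CouchDB map functions are written in JavaScript and call
emit(key, value); the function map_fields below is a stand-in invented
for this example. Keying each row by (document id, field name) is one
way to keep the mapping from view row back to document structure
explicit.

```python
def map_fields(doc):
    """Yield one (key, value) row per top-level field, skipping _id/_rev."""
    for field, value in doc.items():
        if not field.startswith("_"):
            yield (doc["_id"], field), value


doc = {"_id": "abc123", "title": "Hello", "body": "A long body text"}

# Collect the rows the way a view index would hold them.
rows = dict(map_fields(doc))

# A "partial get" of one field is then a lookup by key.
title_only = rows[("abc123", "title")]
```

With deeper nesting the key would need to carry a full access path,
which is where the cost described above comes from.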

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

75% of statistics are made up on the spot.



Re: partial/diff updates?

Posted by Chris Anderson <jc...@gmail.com>.
The question becomes why not just use separate docs?

Sent from my iPhone

On Nov 27, 2008, at 1:38 PM, Antony Blakey <an...@gmail.com>  
wrote:

>
> On 28/11/2008, at 3:29 AM, Timo Isokoski wrote:
>
>> Is this talk about the "diff" feature related to
>>
>> a) How CouchDB physically stores the data on disk
>> b) How data is transmitted between the client and CouchDB
>>
>> In case a) I think diffs are the devil and it goes against the
>> simplicity of
>> CouchDB's inner workings. In case b), wouldn't it be easy to
>> implement some
>> kind of a prototype of this feature as a "proxy server" on top of  
>> CouchDB.
>> The proxy could route the normal requests directly to CouchDB and  
>> the actual
>> diff requests could be handled like this:
>> 1. GET the original document from Couch
>> 2. Apply diff
>> 3. PUT the modified document back to the Couch
>>
>> The functionality can then be integrated into CouchDB itself if the
>> prototype works well and people start using it.
>
> It's case b. Integrating it with CouchDB to test it isn't that  
> difficult - certainly not difficult enough to justify implementing  
> it as a separate proxy, which IMO would involve lots of unnecessary  
> scaffolding.
>
> Those 3 steps are exactly what you would do as a plugin/extension.
>
> I think that any real-world application that justifies partial  
> updates probably also justifies partial gets, and not just a single  
> access path selection.
>
> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> There are two ways of constructing a software design: One way is to  
> make it so simple that there are obviously no deficiencies, and the  
> other way is to make it so complicated that there are no obvious  
> deficiencies.
>  -- C. A. R. Hoare
>
>

Re: partial/diff updates?

Posted by Antony Blakey <an...@gmail.com>.
On 28/11/2008, at 3:29 AM, Timo Isokoski wrote:

> Is this talk about the "diff" feature related to
>
> a) How CouchDB physically stores the data on disk
> b) How data is transmitted between the client and CouchDB
>
> In case a) I think diffs are the devil and it goes against the
> simplicity of
> CouchDB's inner workings. In case b), wouldn't it be easy to
> implement some
> kind of a prototype of this feature as a "proxy server" on top of  
> CouchDB.
> The proxy could route the normal requests directly to CouchDB and  
> the actual
> diff requests could be handled like this:
> 1. GET the original document from Couch
> 2. Apply diff
> 3. PUT the modified document back to the Couch
>
> The functionality can then be integrated into CouchDB itself if the
> prototype works well and people start using it.

It's case b. Integrating it with CouchDB to test it isn't that  
difficult - certainly not difficult enough to justify implementing it  
as a separate proxy, which IMO would involve lots of unnecessary  
scaffolding.

Those 3 steps are exactly what you would do as a plugin/extension.

I think that any real-world application that justifies partial updates  
probably also justifies partial gets, and not just a single access  
path selection.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

There are two ways of constructing a software design: One way is to  
make it so simple that there are obviously no deficiencies, and the  
other way is to make it so complicated that there are no obvious  
deficiencies.
   -- C. A. R. Hoare



Re: partial/diff updates?

Posted by Timo Isokoski <ti...@gmail.com>.
Is this talk about the "diff" feature related to

a) How CouchDB physically stores the data on disk
b) How data is transmitted between the client and CouchDB

In case a) I think diffs are the devil and it goes against the simplicity of
CouchDB's inner workings. In case b), wouldn't it be easy to implement some
kind of a prototype of this feature as a "proxy server" on top of CouchDB?
The proxy could route the normal requests directly to CouchDB and the actual
diff requests could be handled like this:
1. GET the original document from Couch
2. Apply diff
3. PUT the modified document back to the Couch
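
The three steps above can be sketched as follows. This is a minimal
Python illustration under stated assumptions: FakeCouch is an in-memory
stand-in rather than a real CouchDB client, revisions are ignored, and
the "diff" is assumed to be a shallow set of top-level fields to
overwrite. No diff format was agreed in this thread.

```python
import copy


class FakeCouch:
    """In-memory stand-in for a database; not a real CouchDB client."""

    def __init__(self):
        self.docs = {}

    def get(self, doc_id):
        return copy.deepcopy(self.docs[doc_id])

    def put(self, doc_id, doc):
        self.docs[doc_id] = copy.deepcopy(doc)


def apply_diff(doc, diff):
    """Assumed diff format: top-level fields in diff overwrite those in doc."""
    merged = dict(doc)
    merged.update(diff)
    return merged


def handle_diff_request(db, doc_id, diff):
    original = db.get(doc_id)             # 1. GET the original document
    updated = apply_diff(original, diff)  # 2. Apply diff
    db.put(doc_id, updated)               # 3. PUT the modified document back
    return updated


db = FakeCouch()
db.put("abc123", {"_id": "abc123", "title": "old", "body": "text"})
result = handle_diff_request(db, "abc123", {"title": "new"})
```

A real proxy would also have to pass the revision through and handle
conflicts between steps 1 and 3.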

The functionality can then be integrated into CouchDB itself if the
prototype works well and people start using it.


-Timo



2008/11/27 Antony Blakey <an...@gmail.com>

>
> On 27/11/2008, at 10:10 PM, Noah Slater wrote:
>
>> On Thu, Nov 27, 2008 at 08:45:18PM +1030, Antony Blakey wrote:
>>>> * JPath itself is a nebulous concept.
>>>
>>> In what sense do you think the concept is nebulous?
>>
>> It lacks an RFC. :)
>
> I didn't realize that JSON had an RFC! Now that I've read it, I think that
> this statement:
>  "A JSON text is a serialized object or array."
> which dominates this subsequent statement:
>  "The names within an object SHOULD be unique."
> clearly resolves the ambiguity discussed in a previous thread regarding
> duplicate hash keys, in the manner that I suggested. Namely, duplicate keys
> are not allowed because they cannot be the result of serializing a
> javascript object. It specifically defines a JSON *text*, so model
> equivalence isn't sufficient.
>
> Given that JPath is a subset of JavaScript access path syntax and
> semantics, would a definition that references the appropriate ECMA clauses
> meet with your approval? Or is this issue blocked IYO until a full JSON
> transformation/mutation/update RFC is approved (whatever approval means)?
>
> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> There are two ways of constructing a software design: One way is to make it
> so simple that there are obviously no deficiencies, and the other way is to
> make it so complicated that there are no obvious deficiencies.
>  -- C. A. R. Hoare
>
>
>


-- 
Timo Isokoski
+358503054649

Re: partial/diff updates?

Posted by Antony Blakey <an...@gmail.com>.
On 27/11/2008, at 10:10 PM, Noah Slater wrote:

> On Thu, Nov 27, 2008 at 08:45:18PM +1030, Antony Blakey wrote:
>>> * JPath itself is a nebulous concept.
>>
>> In what sense do you think the concept is nebulous?
>
> It lacks an RFC. :)

I didn't realize that JSON had an RFC! Now that I've read it, I think  
that this statement:
   "A JSON text is a serialized object or array."
which dominates this subsequent statement:
   "The names within an object SHOULD be unique."
clearly resolves the ambiguity discussed in a previous thread  
regarding duplicate hash keys, in the manner that I suggested. Namely,  
duplicate keys are not allowed because they cannot be the result of  
serializing a javascript object. It specifically defines a JSON  
*text*, so model equivalence isn't sufficient.

Given that JPath is a subset of JavaScript access path syntax and
semantics, would a definition that references the appropriate ECMA
clauses meet with your approval? Or is this issue blocked IYO until a
full JSON transformation/mutation/update RFC is approved (whatever
approval means)?

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

There are two ways of constructing a software design: One way is to  
make it so simple that there are obviously no deficiencies, and the  
other way is to make it so complicated that there are no obvious  
deficiencies.
   -- C. A. R. Hoare



Re: partial/diff updates?

Posted by Noah Slater <ns...@apache.org>.
On Thu, Nov 27, 2008 at 08:45:18PM +1030, Antony Blakey wrote:
>> * JPath itself is a nebulous concept.
>
> In what sense do you think the concept is nebulous?

It lacks an RFC. :)

-- 
Noah Slater, http://tumbolia.org/nslater

Re: partial/diff updates?

Posted by Antony Blakey <an...@gmail.com>.
On 27/11/2008, at 8:28 PM, Noah Slater wrote:

> * JPath itself is a nebulous concept.

In what sense do you think the concept is nebulous?

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Only two things are infinite, the universe and human stupidity, and  
I'm not sure about the former.
  -- Albert Einstein


Re: partial/diff updates?

Posted by Noah Slater <ns...@apache.org>.
On Thu, Nov 27, 2008 at 01:53:21AM -0500, Jedediah Smith wrote:
> You'd need a few rules for resolving ambiguities and attachments would
> probably have to use a different url (_attachment/file?) but it would be very
> convenient and RESTful.

The way I understand it, you're proposing to operate using HTTP verbs on subsets
of a JSON document using something functionally identical to XPath for XML.

The problems I see with this approach are:

 * JPath itself is a nebulous concept.

 * You're minting new URIs for each JPath expression. Ideally this would be done
   via query strings so that the resource URI maintained its identity.

 * There would be no way to do multiple updates per document, or to operate on
   multiple documents at the same time.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: partial/diff updates?

Posted by Chris Anderson <jc...@apache.org>.
Thanks for bringing this up, Jedediah.



On Wed, Nov 26, 2008 at 10:53 PM, Jedediah Smith
<je...@silencegreys.com> wrote:

> GET /db/abc123/foo/bar/2/baz
>
> 3
>
> PUT /db/abc123/foo/durr [4,5,6]

If we require valid _revs for all PUTs (even to subfields) we'll leave
the document semantics only minimally changed. You can use it to cut
down on transfer size. Even enabling it only for top-level attributes
could be a valuable tool.

for array updates:

{
 "_id": "abc123",
 "foo": { "bar": [1,2,{ "baz": 3 }] }
}

PUT /db/abc123/foo/bar/2 {"replace":2}

GET  /db/abc123

{
 "_id": "abc123",
 "foo": { "bar": [1,2,{ "replace": 2 }] }
}

but that's a bit trickier. just allowing top-level field updates (no
deep nesting) keeps us from worrying about arrays.

Changing a field on a remote document becomes just a GET against
_all_docs with key=_id followed by a PUT to the subfield. No diff
format required.
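
A minimal sketch of that idea, in Python and with loud assumptions:
the store is a plain dict rather than CouchDB, revision tokens are
simplified to a counter, and put_field and Conflict are names invented
for this example. Only top-level fields are handled, as suggested
above.

```python
class Conflict(Exception):
    """Raised when the caller's _rev is not the current revision."""


def put_field(store, doc_id, rev, field, value):
    """Replace one top-level field, but only if rev matches the stored _rev."""
    doc = store[doc_id]
    if doc["_rev"] != rev:
        raise Conflict("stale _rev %r for %r" % (rev, doc_id))
    updated = dict(doc)
    updated[field] = value
    # Simplified revision token: a counter kept as a string.
    updated["_rev"] = str(int(doc["_rev"]) + 1)
    store[doc_id] = updated
    return updated["_rev"]


store = {"abc123": {"_id": "abc123", "_rev": "1", "foo": 1}}

new_rev = put_field(store, "abc123", "1", "foo", 2)

# A second writer still holding the old revision is rejected.
try:
    put_field(store, "abc123", "1", "foo", 3)
    stale_rejected = False
except Conflict:
    stale_rejected = True
```

Because every write still needs a valid _rev, the document-level
conflict semantics stay the same; only the request body gets smaller.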



-- 
Chris Anderson
http://jchris.mfdz.com

Re: partial/diff updates?

Posted by Jedediah Smith <je...@silencegreys.com>.
Has something like this been considered?

{
   "_id": "abc123",
   "foo": { "bar": [1,2,{ "baz": 3 }] }
}


GET /db/abc123/foo/bar/2/baz

3

PUT /db/abc123/foo/durr [4,5,6]

GET /db/abc123

{
   "_id": "abc123",
   "foo": {
     "bar": [1,2,{ "baz": 3 }],
     "durr": [4,5,6]
   }
}


You'd need a few rules for resolving ambiguities and attachments would 
probably have to use a different url (_attachment/file?) but it would be 
very convenient and RESTful.
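
For illustration, here is one way the path handling could work, as a
Python sketch operating on an already-parsed document. The functions
get_path and put_path are hypothetical, and the rule used to resolve
one ambiguity (a segment is treated as an array index only when the
current node is an array) is an assumption, not something settled
here.

```python
def _step(node, segment):
    """Descend one level; use an integer index only inside arrays."""
    if isinstance(node, list):
        return node[int(segment)]
    return node[segment]


def get_path(doc, path):
    """Return the value at a slash-separated path such as 'foo/bar/2/baz'."""
    node = doc
    for segment in path.split("/"):
        node = _step(node, segment)
    return node


def put_path(doc, path, value):
    """Set the value at a slash-separated path, creating the final member."""
    *parents, last = path.split("/")
    node = doc
    for segment in parents:
        node = _step(node, segment)
    if isinstance(node, list):
        node[int(last)] = value
    else:
        node[last] = value


doc = {"_id": "abc123", "foo": {"bar": [1, 2, {"baz": 3}]}}

fetched = get_path(doc, "foo/bar/2/baz")
put_path(doc, "foo/durr", [4, 5, 6])
```

The paths here are relative to the document, i.e. the part of the URL
after /db/abc123/.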

Re: partial/diff updates?

Posted by Paul Davis <pa...@gmail.com>.
On Wed, Nov 26, 2008 at 8:00 PM, Noah Slater <ns...@apache.org> wrote:
> On Wed, Nov 26, 2008 at 05:34:14PM -0800, Jan Lehnardt wrote:
>> This keeps coming up. I am seeing the words "seems", "would", "some", "guess",
>> "scenario", "brokeback mountain". I'd rather see numbers supporting the
>> cause. So far, other things prove to be a bottleneck. I'd like to see a real
>> app that would benefit from this. I'm not saying there is none, I'd just like
>> to see it :)
>
> Brokeback mountain? What the... I think you've spent too long in the USA, you
> must be getting travel sick. Come back to Europe as quick as you can!
>
> My opinion is that if the community (CouchDB or JSON) can propose a workable
> JSON diff format with interoperable client implementations this would be a huge
> win for CouchDB, I wouldn't even need a specific use case to convince me.
>
> Damien points out that oftentimes large documents can be broken down into
> smaller documents, and you, quite rightly, ask for specific use cases. My view
> is that if the community can standardise this outside of CouchDB it's a win/win
> situation for everyone if we implement it.
>
> --
> Noah Slater, http://tumbolia.org/nslater
>

I'm with Noah on this one. Unlike other things I've seen recently, I
do not want to implement shoddy, half-thought-out specs on the fly in
CouchDB. If the JSON community picked one, adding support would
probably be trivial and useful. Until then I don't really see much of
a point in creating some CouchDB-specific one.

That's not to say that we might not want to poke the JSON community
and say "decide" and use a bit of Jan's clout to make things happen.
:D

Paul

Re: partial/diff updates?

Posted by Jan Lehnardt <ja...@apache.org>.
On 26 Nov 2008, at 18:00, Noah Slater wrote:

> On Wed, Nov 26, 2008 at 05:34:14PM -0800, Jan Lehnardt wrote:
>> This keeps coming up. I am seeing the words "seems", "would",
>> "some", "guess",
>> "scenario", "brokeback mountain". I'd rather see numbers supporting the
>> cause. So far, other things prove to be a bottleneck. I'd like to  
>> see a real
>> app that would benefit from this. I'm not saying there is none, I'd  
>> just like
>> to see it :)
>
> Brokeback mountain? What the... I think you've spent too long in the  
> USA, you
> must be getting travel sick. Come back to Europe as quick as you can!

I could have sworn that was in there somewhere :)

Cheers
Jan
--

Re: partial/diff updates?

Posted by Noah Slater <ns...@apache.org>.
On Wed, Nov 26, 2008 at 05:34:14PM -0800, Jan Lehnardt wrote:
> This keeps coming up. I am seeing the words "seems", "would", "some", "guess",
> "scenario", "brokeback mountain". I'd rather see numbers supporting the
> cause. So far, other things prove to be a bottleneck. I'd like to see a real
> app that would benefit from this. I'm not saying there is none, I'd just like
> to see it :)

Brokeback mountain? What the... I think you've spent too long in the USA, you
must be getting travel sick. Come back to Europe as quick as you can!

My opinion is that if the community (CouchDB or JSON) can propose a workable
JSON diff format with interoperable client implementations this would be a huge
win for CouchDB, I wouldn't even need a specific use case to convince me.

Damien points out that oftentimes large documents can be broken down into
smaller documents, and you, quite rightly, ask for specific use cases. My view
is that if the community can standardise this outside of CouchDB it's a win/win
situation for everyone if we implement it.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: partial/diff updates?

Posted by Jan Lehnardt <ja...@apache.org>.
On 26 Nov 2008, at 09:57, Liam Staskawicz wrote:

> When updating a document, is there any notion of submitting a  
> partial update?  It seems like being able to specify that only some  
> subset of the fields in a document should be updated would offer  
> some efficiency benefits.  I guess I had in mind some scenario where  
> CouchDB would create the updated record by merging the existing  
> revision with the new info and saving the updated revision, but I'm  
> still new to CouchDB so I don't have a good sense of whether this  
> tramples on any important concepts.
>
> Thanks for any thoughts.

This keeps coming up. I am seeing the words "seems", "would", "some",
"guess", "scenario", "brokeback mountain". I'd rather see numbers
supporting the cause. So far, other things prove to be a bottleneck.  
I'd like to see a real app that would benefit from this. I'm not  
saying there is none, I'd just like to see it :)

Cheers
Jan
--