Posted to user@couchdb.apache.org by Antony Blakey <an...@gmail.com> on 2009/01/01 01:40:17 UTC

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

On 31/12/2008, at 11:29 PM, Geir Magnusson Jr. wrote:

> What trouble?  I think this is *exactly* what should be done - have  
> CouchDB store documents that are :
>
> {
>    metadata : { _rev : X, _id : Y, _woogie: Z, .... anything that  
> needs to be added in the future, like other metadata like last  
> update date... },
>    userdata : {  .... the document you want to store .... }
> }
>
> and then offer APIs that let you :
>
> a) get to this document, for libraries and clients that know they  
> are talking to Couch and want to manipulate at this level
>
> b) return and accept the userdocument directly, for clients that  
> just want to consume or produce  JSON data, w/o caring about the  
> internal housekeeping

One of the issues complicating the logic of this discussion is that  
the document id is both metadata and, conceptually, a document member.  
That's why, although the purest model is to have the userdata as a  
member within a Couch document as you suggest, this doesn't look that  
appealing:

{
   metadata: {
     id: ...
     rev: ...
     ...
   }
   data: {
     ... the user's document ...
   }
}

Furthermore, from a scalability perspective, always having the
metadata when you have the document isn't a problem: the metadata is
constrained. The reverse situation, always having the data when you
have the metadata, is not constrained because the data is arbitrarily
large. IMO this means that a solution such as this:

{
   id: ...
   rev: ...
   ...
   data: {
     ... the user's document ...
   }
}

isn't such a good idea compared to this:

{
   _metadata: {
     id: ...
     rev: ...
   }
   ... the user's document ...
}

Unfortunately the reserved token makes the structure non-reflexive  
without transformation, and although that's not currently an issue, I  
can imagine it complicating certain use-cases. It makes the system  
more complicated to reason about.
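
To make the reserved-token problem concrete, here is a minimal JavaScript
sketch of that last layout. The helper names toStored and fromStored are
hypothetical and not part of CouchDB; the sketch assumes only that
_metadata is the single reserved member. It shows that a document which
already carries a _metadata member (for example, one read back from the
store) cannot be stored again unchanged.

```javascript
// Hypothetical helpers for illustration; neither is a CouchDB API.
const RESERVED = "_metadata";

// Combine user data and metadata into the "_metadata member" layout.
function toStored(userDoc, meta) {
  if (Object.prototype.hasOwnProperty.call(userDoc, RESERVED)) {
    // The reserved token collides with a user member, so an arbitrary
    // JSON document cannot be stored without some transformation.
    throw new Error("user document already uses " + RESERVED);
  }
  return { [RESERVED]: meta, ...userDoc };
}

// Recover the two parts from a stored document.
function fromStored(stored) {
  const { [RESERVED]: meta, ...userDoc } = stored;
  return { meta, userDoc };
}

const stored = toStored({ title: "Hello" }, { id: "a1", rev: "1" });
const roundTrip = fromStored(stored);
```

Calling toStored(stored, ...) on the result throws, which is the
transformation cost referred to above: a real design would have to escape
or rename the colliding member rather than reject the document.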

I'm struggling to objectively evaluate this model and your reflexive  
model - given Damien's attitude to this issue, my motivation to do so  
is somewhat depressed :/

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Did you hear about the Buddhist who refused Novocain during a root  
canal?
His goal: transcend dental medication.

Re: Decision making process (Re: Changing rev to _rev in view results (Was: Re: newbie question #1))

Posted by Noah Slater <ns...@apache.org>.

On Sat, Jan 03, 2009 at 12:27:20AM +1030, Antony Blakey wrote:
> Is this voting/decision-making process visible?

Yes, the formal decision making process is visible. However, as a group of
people who work closely together we communicate in lots of other ways, be that
public IRC, private IRC, private email, Twitter, or telephone calls. It is
important that we discuss everything official on the mailing list as per The
Apache Way, but it is a natural consequence of our working together that we
become informally aware of each other's position on various topics. Short of
refusing to discuss CouchDB "in real life," I see this as unavoidable.

> If I'm trying to effect a change, how can I judge what would get it over the
> line i.e. who objects and for what reasons?

I guess the best way would be to do a proper proposal and request a vote. This
vote wouldn't be anything official, but it would be an informal way of
soliciting definite feedback from the community and the committers.

> How do I ensure that my patch isn't stuck in development hell while the PMC
> (maybe) waits for consensus to (maybe never) emerge?

By submitting a patch to the mailing list and requesting a vote.

> If I have a change in mind, I would prefer where possible to get at
> least a majority of the PMC on side before doing (possibly a lot of)
> work.

Makes sense. I guess that's where a proper proposal would come in.

> It looks like it is only when a patch is up for approval that any real
> decision will be made i.e. I can't prompt a decision on the metadata thread
> without doing all the work, which may be pointless. That makes a proposal very
> expensive, especially if the PMC doesn't operate in a predictable fashion.

Predictable fashion?

> That in turn makes it less likely that people will contribute, and seems to be
> a strategic weakness.

I'm not sure I follow your reasoning at all.

The way I see it is that if someone suggests something and we all love it, there
is a clear consensus and we would go forward with that. The reverse is also
true. In this particular case it is clear to me that we do not have consensus. I
think part of that may be due to there being no concrete proposal, everyone
seems to be suggesting different things!

We are only at this juncture because nobody can agree on the best way
forwards. All I am saying is that maybe a proper proposal would be a nice idea,
something we can all say "+1" on, or not as the case may be. If you asked me now
what my thoughts were on the whole topic I would tell you that I don't know,
because I'm not even sure what the agreed proposal is, let alone what the
consensus is.

> IMO an improvement to this process would be a mechanism to submit a proposal
> to the *PMC*, to get some definitive feedback, the contract being that I'll do
> the proposal and the implementation in return for the PMC being prepared to
> give a provisional indication that a) such work would not be in vain; or b)
> the proposal won't be pursued; or c) more work is needed, but it's not out of
> the question; and/or d) some comment about what would need to be addressed to
> move forward. All of which is to say: is the PMC reified in some way that
> doesn't involve canvassing each member individually?

This process is already in place! Heh. All anyone has to do is write up a
concrete proposal of the changes and submit it to the mailing list requesting a
vote. We can move forward from there. If there are comments about the proposal,
the sponsor could collect them and draft a new proposal. Rinse and repeat.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Decision making process (Re: Changing rev to _rev in view results (Was: Re: newbie question #1))

Posted by Antony Blakey <an...@gmail.com>.

On 03/01/2009, at 12:03 AM, Noah Slater wrote:

> It's worth noting that we may never make a decision, or the decision  
> might take
> a long time. Looking back at the newline proposal, we never decided  
> consensus
> was to reject it outright, but we never resolved it the other way  
> either. One of
> us could move to call a vote, but we all feel it's better to wait for  
> consensus.

Is this voting/decision-making process visible? If I'm trying to  
effect a change, how can I judge what would get it over the line i.e.  
who objects and for what reasons? How do I ensure that my patch isn't  
stuck in development hell while the PMC (maybe) waits for consensus to  
(maybe never) emerge?

If I have a change in mind, I would prefer where possible to get at  
least a majority of the PMC on side before doing (possibly a lot of)  
work. It looks like it is only when a patch is up for approval that  
any real decision will be made i.e. I can't prompt a decision on the  
metadata thread without doing all the work, which may be pointless.  
That makes a proposal very expensive, especially if the PMC doesn't  
operate in a predictable fashion. That in turn makes it less likely
that people will contribute, and seems to be a strategic weakness.

IMO an improvement to this process would be a mechanism to submit a  
proposal to the *PMC*, to get some definitive feedback, the contract  
being that I'll do the proposal and the implementation in return for  
the PMC being prepared to give a provisional indication that a) such  
work would not be in vain; or b) the proposal won't be pursued; or c)  
more work is needed, but it's not out of the question; and/or d) some  
comment about what would need to be addressed to move forward. All of  
which is to say: is the PMC reified in some way that doesn't involve  
canvassing each member individually?

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

You can't just ask customers what they want and then try to give that  
to them. By the time you get it built, they'll want something new.
   -- Steve Jobs

Re: Decision making process (Re: Changing rev to _rev in view results (Was: Re: newbie question #1))

Posted by Noah Slater <ns...@apache.org>.

On Fri, Jan 02, 2009 at 11:19:18PM +1030, Antony Blakey wrote:
> That would confirm my presumption that it's probably a waste of time pushing
> something that Damien's firmly against, as opposed to expecting to vote on it.

I'm not too sure this is a wise approach. Chris is correct in that we've opted
for following consensus in the past, but that doesn't mean that whatever Damien
says we all agree with. There's still the issue of the JSON newline which has
the committers split right down the middle!

I see that this issue can take one of three possible routes:

  * the PMC decide that consensus is to reject the proposal

  * the PMC decide that opinion is split and hold an ASF style vote

  * the PMC decide that consensus is to accept the proposal

The PMC comprises me, Chris, Chris, Jan, and Damien.

It's worth noting that we may never make a decision, or the decision might take
a long time. Looking back at the newline proposal, we never decided consensus
was to reject it outright, but we never resolved it the other way either. One of
us could move to call a vote, but we all feel it's better to wait for consensus.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Noah Slater <ns...@apache.org>.

On Fri, Jan 02, 2009 at 04:03:50PM +1030, Antony Blakey wrote:
> It's never been clear to me that there is a process for voting - the decision
> making process within the commit group seems opaque.

If a patch was submitted, and it was in question, we could have a standard ASF
vote to decide if we wanted to accept it or not.

> I think a change to the API could be decided without reference to the code
> implementing that change. In fact, IMO the API *should* be considered
> separately from the code implementing that change. Otherwise APIs will tend to
> be decided not on the basis of design, but on the amount of effort some person
> is prepared to spend to demonstrate it, and hence code inertia, often
> resulting in expedient solutions. This means that good, but expensive ideas,
> can be lost.

This implies that expedient solutions are necessarily bad. I actually think
there is a trade-off between conceptually good ideas, and practically good
ideas. One of the things that separates those is working code! That is to say,
it's easy to come up with ideas, but harder to execute them -- and I think both
of those factors are important when you're discussing the way forward.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

exactly :)

On Jan 4, 2009, at 11:35 AM, Michael Fellinger wrote:

> On Fri, Jan 2, 2009 at 9:30 PM, Robert Dionne <bo...@gmail.com>  
> wrote:
>> When it comes to design I think there are always tradeoffs. This is  
>> where
>> intuition and experience count. In my opinion separating metadata  
>> from the
>> user's data is a more complex approach. It creates two parts to the
>> document, they have to be handled separately and it creates the  
>> need for two
>> kinds of API calls for the two types of data.  It seems like a good
>> approach, however it's very easy to look at an existing  
>> implementation and
>> see how things "ought" to be done.
>>
>> The current implementation has a nice simplicity to it that I would  
>> not
>> readily advocate changing. My first impression is that it reminded  
>> me of
>> Berkeley DB on steroids. The convention governing the use of the  
>> _id is not
>> that hard to deal with and it doesn't prevent one from handling  
>> JSON docs
>> that come from elsewhere. It seems that converting data from one  
>> database
>> system to another always involves some transformation.
>
> For what it's worth, I'd love to see a separation of data and data
> about data, and I'd also like to propose a change to map functions:
>
> function(doc, meta){ emit(doc.title, [meta.id, meta.rev]); }
> function(doc){ emit(doc.title); }
>
> That way people who are complaining about having to type doc.doc.title
> can have their peace of mind as well.
>
> I might see the whole issue very limited, but it makes absolutely the
> most sense to me, everybody can stop worrying about which data is
> allowed in the data part and the metadata can grow in any direction.
> Nobody is annoyed by prefixing _ anymore (symmetric API) and there is
> little technical need anymore for validating documents (as long as
> they can be serialized to valid JSON) before putting them into the db.
>
> ^ manveru
>
>> This discussion reminds me of Perlis' epigram(#15) that everything  
>> should be
>> built top down, except the first time.
>>
>>
>> On Jan 2, 2009, at 12:33 AM, Antony Blakey wrote:
>>
>>>
>>> On 02/01/2009, at 2:17 PM, Noah Slater wrote:
>>>
>>>> I appreciate you're frustrated with the current situation Antony,  
>>>> but I
>>>> think
>>>> it's unfair for you to be claiming any kind of consensus without  
>>>> a vote.
>>>
>>> That post wasn't meant to be a criticism. Apologies if it felt  
>>> like it
>>> was.
>>>
>>> There isn't a clear consensus in this thread, which to my mind  
>>> reflects
>>> the fact that there are trade-offs that don't have objective  
>>> evaluation
>>> measures.
>>>
>>> I fully support the idea that a product should reflect the vision  
>>> and
>>> opinion of a very small group. Abstracting from my preference for  
>>> a more
>>> robustly theoretical approach to API design, the holistically best  
>>> result is
>>> likely to arise from this model. So I don't e.g. mean 'gatekeeper'  
>>> in a
>>> negative way.
>>>
>>>> I would
>>>> be interested in seeing a patch, explanation, and vote. I've  
>>>> already
>>>> expressed
>>>> my agreement with many of the points you've raised, and I'm not  
>>>> the only
>>>> one.
>>>
>>> I was only referring to a lack of expressed support for a fully  
>>> reflexive
>>> model.
>>>
>>> It's never been clear to me that there is a process for voting - the
>>> decision making process within the commit group seems opaque.
>>>
>>>> It's pretty pointless for us to keep sending emails over proposed  
>>>> changes
>>>> to the
>>>> code without actually seeing the changes.
>>>
>>> I think a change to the API could be decided without reference to  
>>> the code
>>> implementing that change. In fact, IMO the API *should* be  
>>> considered
>>> separately from the code implementing that change. Otherwise APIs  
>>> will tend
>>> to be decided not on the basis of design, but on the amount of  
>>> effort some
>>> person is prepared to spend to demonstrate it, and hence code  
>>> inertia, often
>>> resulting in expedient solutions. This means that good, but  
>>> expensive ideas,
>>> can be lost.
>>>
>>> The models under discussion have evolved from simple name identity  
>>> by
>>> using '_id' and '_rev' everywhere, to a '_meta' wrapper, to Geir's  
>>> fully
>>> reflexive model.
>>>
>>> So I'd prefer to get buy-in to a model or principles, at which point
>>> anyone could implement it. That's what I tried to do with the  
>>> change to the
>>> FS layout to support i18n, the committable implementation of which  
>>> is my
>>> focus right now.

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "ara.t.howard" <ar...@gmail.com>.

On Jan 4, 2009, at 9:35 AM, Michael Fellinger wrote:

> For what it's worth, I'd love to see a separation of data and data
> about data, and I'd also like to propose a change to map functions:
>
> function(doc, meta){ emit(doc.title, [meta.id, meta.rev]); }
> function(doc){ emit(doc.title); }
>
> That way people who are complaining about having to type doc.doc.title
> can have their peace of mind as well.
>
> I might see the whole issue very limited, but it makes absolutely the
> most sense to me, everybody can stop worrying about which data is
> allowed in the data part and the metadata can grow in any direction.
> Nobody is annoyed by prefixing _ anymore (symmetric API) and there is
> little technical need anymore for validating documents (as long as
> they can be serialized to valid JSON) before putting them into the db.
>
> ^ manveru



i think those of us who have maintained a lot of code ourselves see  
the wisdom in this.  having rules that are 100% consistent lifts a  
massive mental weight for anyone connected to the code.

keys.each do |key|
   unless key =~ %r/^_/
     ...


just gives me hives to think about.

a @ http://codeforpeople.com/
--
we can deny everything, except that we have the possibility of being  
better. simply reflect on that.
h.h. the 14th dalai lama

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Michael Fellinger <m....@gmail.com>.

On Fri, Jan 2, 2009 at 9:30 PM, Robert Dionne <bo...@gmail.com> wrote:
> When it comes to design I think there are always tradeoffs. This is where
> intuition and experience count. In my opinion separating metadata from the
> user's data is a more complex approach. It creates two parts to the
> document, they have to be handled separately and it creates the need for two
> kinds of API calls for the two types of data.  It seems like a good
> approach, however it's very easy to look at an existing implementation and
> see how things "ought" to be done.
>
> The current implementation has a nice simplicity to it that I would not
> readily advocate changing. My first impression is that it reminded me of
> Berkeley DB on steroids. The convention governing the use of the _id is not
> that hard to deal with and it doesn't prevent one from handling JSON docs
> that come from elsewhere. It seems that converting data from one database
> system to another always involves some transformation.

For what it's worth, I'd love to see a separation of data and data
about data, and I'd also like to propose a change to map functions:

function(doc, meta){ emit(doc.title, [meta.id, meta.rev]); }
function(doc){ emit(doc.title); }

That way people who are complaining about having to type doc.doc.title
can have their peace of mind as well.
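
As a rough illustration of how both signatures could coexist, here is a
small JavaScript sketch. Everything in it is an assumption made for
illustration, not the actual CouchDB view server: the names split and
runMap are hypothetical, emit is a local stand-in that just collects
rows, the metadata is assumed to be today's underscore-prefixed top-level
members, and the revision is exposed as meta.rev.

```javascript
// Hypothetical sketch; none of these names are CouchDB APIs.
const rows = [];

// Stand-in for the view server's emit(): collect rows in memory.
function emit(key, value) {
  rows.push({ key, value: value === undefined ? null : value });
}

// Split a stored document into user data and metadata, assuming
// metadata is every top-level member whose name starts with "_".
function split(stored) {
  const doc = {};
  const meta = {};
  for (const [name, value] of Object.entries(stored)) {
    if (name.startsWith("_")) meta[name.slice(1)] = value;
    else doc[name] = value;
  }
  return { doc, meta };
}

// Always pass both arguments; a one-argument map function simply
// ignores the second one.
function runMap(mapFn, stored) {
  const { doc, meta } = split(stored);
  mapFn(doc, meta);
}

const storedDoc = { _id: "a1", _rev: "1-abc", title: "Hello" };
runMap(function (doc, meta) { emit(doc.title, [meta.id, meta.rev]); }, storedDoc);
runMap(function (doc) { emit(doc.title); }, storedDoc);
```

Because JavaScript ignores surplus arguments, existing one-argument map
functions would keep working under such a scheme without any changes.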

I might see the whole issue very limited, but it makes absolutely the
most sense to me, everybody can stop worrying about which data is
allowed in the data part and the metadata can grow in any direction.
Nobody is annoyed by prefixing _ anymore (symmetric API) and there is
little technical need anymore for validating documents (as long as
they can be serialized to valid JSON) before putting them into the db.

^ manveru

> This discussion reminds me of Perlis' epigram(#15) that everything should be
> built top down, except the first time.
>
>
> On Jan 2, 2009, at 12:33 AM, Antony Blakey wrote:
>
>>
>> On 02/01/2009, at 2:17 PM, Noah Slater wrote:
>>
>>> I appreciate you're frustrated with the current situation Antony, but I
>>> think
>>> it's unfair for you to be claiming any kind of consensus without a vote.
>>
>> That post wasn't meant to be a criticism. Apologies if it felt like it
>> was.
>>
>> There isn't a clear consensus in this thread, which to my mind reflects
>> the fact that there are trade-offs that don't have objective evaluation
>> measures.
>>
>> I fully support the idea that a product should reflect the vision and
>> opinion of a very small group. Abstracting from my preference for a more
>> robustly theoretical approach to API design, the holistically best result is
>> likely to arise from this model. So I don't e.g. mean 'gatekeeper' in a
>> negative way.
>>
>>> I would
>>> be interested in seeing a patch, explanation, and vote. I've already
>>> expressed
>>> my agreement with many of the points you've raised, and I'm not the only
>>> one.
>>
>> I was only referring to a lack of expressed support for a fully reflexive
>> model.
>>
>> It's never been clear to me that there is a process for voting - the
>> decision making process within the commit group seems opaque.
>>
>>> It's pretty pointless for us to keep sending emails over proposed changes
>>> to the
>>> code without actually seeing the changes.
>>
>> I think a change to the API could be decided without reference to the code
>> implementing that change. In fact, IMO the API *should* be considered
>> separately from the code implementing that change. Otherwise APIs will tend
>> to be decided not on the basis of design, but on the amount of effort some
>> person is prepared to spend to demonstrate it, and hence code inertia, often
>> resulting in expedient solutions. This means that good, but expensive ideas,
>> can be lost.
>>
>> The models under discussion have evolved from simple name identity by
>> using '_id' and '_rev' everywhere, to a '_meta' wrapper, to Geir's fully
>> reflexive model.
>>
>> So I'd prefer to get buy-in to a model or principles, at which point
>> anyone could implement it. That's what I tried to do with the change to the
>> FS layout to support i18n, the committable implementation of which is my
>> focus right now.

Re: Decision making process (Re: Changing rev to _rev in view results (Was: Re: newbie question #1))

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

On Jan 2, 2009, at 7:37 AM, Chris Anderson wrote:

> On Fri, Jan 2, 2009 at 4:16 AM, Geir Magnusson Jr. <ge...@pobox.com>  
> wrote:
>>
>> On Jan 2, 2009, at 12:33 AM, Antony Blakey wrote:
>>>
>>> It's never been clear to me that there is a process for voting - the
>>> decision making process within the commit group seems opaque.
>>
>> How can that be?  I assume that all decisions are made in public on  
>> the dev@
>> mailing list.
>>
>
> The committers have a history of deferring to Damien (especially on
> deeply technical matters like the document identity model). It's fair
> to say that most of what we do is bug fixes and the like. When we have
> a new feature or module under development, we like to run the code by
> Damien before we commit it. He understands CouchDB inside and out, and
> he's pretty good at seeing how an API detail or caching property will
> affect the big picture of how people use CouchDB.

That's perfectly reasonable if it's on the dev@ list.

>
>
> There have been votes on the dev list before, but they are rare
> because we so often move with consensus.

That's also perfectly reasonable, and in fact, how I personally like  
to participate - it seems that if it comes down to a vote, the  
consensus mode has failed.

geir

>
>
> -- 
> Chris Anderson
> http://jchris.mfdz.com

Re: Decision making process (Re: Changing rev to _rev in view results (Was: Re: newbie question #1))

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

On Jan 2, 2009, at 7:49 AM, Antony Blakey wrote:

>
> On 02/01/2009, at 11:07 PM, Chris Anderson wrote:
>
>> On Fri, Jan 2, 2009 at 4:16 AM, Geir Magnusson Jr. <ge...@pobox.com>  
>> wrote:
>>>
>>> On Jan 2, 2009, at 12:33 AM, Antony Blakey wrote:
>>>>
>>>> It's never been clear to me that there is a process for voting -  
>>>> the
>>>> decision making process within the commit group seems opaque.
>>>
>>> How can that be?  I assume that all decisions are made in public  
>>> on the dev@
>>> mailing list.
>>>
>>
>> The committers have a history of deferring to Damien (especially on
>> deeply technical matters like the document identity model). It's fair
>> to say that most of what we do is bug fixes and the like. When we  
>> have
>> a new feature or module under development, we like to run the code by
>> Damien before we commit it. He understands CouchDB inside and out,  
>> and
>> he's pretty good at seeing how an API detail or caching property will
>> affect the big picture of how people use CouchDB.
>
> That would confirm my presumption that it's probably a waste of time  
> pushing something that Damien's firmly against, as opposed to  
> expecting to vote on it.

That would be the case for any committer - anyone can veto a code
change, including one by Damien. (Technically it's a PMC member, but
some communities include all committers in this, which IMO is a good
way.)

However, the fact that people defer to Damien is perfectly  
reasonable.  (But remember the goal of an ASF community - you want to  
make it strong enough that the original founders/motivators can step  
back)

>
>
>> There have been votes on the dev list before, but they are rare
>> because we so often move with consensus.
>
> I'm merely curious about this, but now that Couch is formally an  
> Apache project, is there some Apache-mandated consensus-driven  
> decision making approach, or do they accept whatever model comes in  
> - I'm wondering if the ASF brand might 'mean' something in that  
> sense. And I've just noticed that Geir is on the ASF board - he  
> should know if anyone does!

I'm not here as anything but an interested user that knows something  
about the ASF, and I see in a followup that you found something on the  
website :)  All is well.

geir

Re: Decision making process (Re: Changing rev to _rev in view results (Was: Re: newbie question #1))

Posted by Antony Blakey <an...@gmail.com>.

On 02/01/2009, at 11:19 PM, Antony Blakey wrote:

> I'm merely curious about this, but now that Couch is formally an  
> Apache project, is there some Apache-mandated consensus-driven  
> decision making approach, or do they accept whatever model comes in  
> - I'm wondering if the ASF brand might 'mean' something in that  
> sense. And I've just noticed that Geir is on the ASF board - he  
> should know if anyone does!

Ahh yes, I should have read the Apache site :/ I'll slap my own wrist.

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Reflecting on W.H. Auden's contemplation of 'necessary murders' in the  
Spanish Civil War, George Orwell wrote that such amorality was only  
really possible, 'if you are the kind of person who is always  
somewhere else when the trigger is pulled'.
   -- John Birmingham, "Appeasing Jakarta"

Re: Decision making process (Re: Changing rev to _rev in view results (Was: Re: newbie question #1))

Posted by Antony Blakey <an...@gmail.com>.

On 02/01/2009, at 11:07 PM, Chris Anderson wrote:

> On Fri, Jan 2, 2009 at 4:16 AM, Geir Magnusson Jr. <ge...@pobox.com>  
> wrote:
>>
>> On Jan 2, 2009, at 12:33 AM, Antony Blakey wrote:
>>>
>>> It's never been clear to me that there is a process for voting - the
>>> decision making process within the commit group seems opaque.
>>
>> How can that be?  I assume that all decisions are made in public on  
>> the dev@
>> mailing list.
>>
>
> The committers have a history of deferring to Damien (especially on
> deeply technical matters like the document identity model). It's fair
> to say that most of what we do is bug fixes and the like. When we have
> a new feature or module under development, we like to run the code by
> Damien before we commit it. He understands CouchDB inside and out, and
> he's pretty good at seeing how an API detail or caching property will
> affect the big picture of how people use CouchDB.

That would confirm my presumption that it's probably a waste of time  
pushing something that Damien's firmly against, as opposed to  
expecting to vote on it.

> There have been votes on the dev list before, but they are rare
> because we so often move with consensus.

I'm merely curious about this, but now that Couch is formally an Apache
project, is there some Apache-mandated consensus-driven decision making
approach, or do they accept whatever model comes in - I'm wondering if  
the ASF brand might 'mean' something in that sense. And I've just  
noticed that Geir is on the ASF board - he should know if anyone does!

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

The intuitive mind is a sacred gift and the rational mind is a  
faithful servant. We have created a society that honours the servant  
and has forgotten the gift.
   -- Albert Einstein

Re: Decision making process (Re: Changing rev to _rev in view results (Was: Re: newbie question #1))

Posted by Chris Anderson <jc...@gmail.com>.

On Fri, Jan 2, 2009 at 4:16 AM, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>
> On Jan 2, 2009, at 12:33 AM, Antony Blakey wrote:
>>
>> It's never been clear to me that there is a process for voting - the
>> decision making process within the commit group seems opaque.
>
> How can that be?  I assume that all decisions are made in public on the dev@
> mailing list.
>

The committers have a history of deferring to Damien (especially on
deeply technical matters like the document identity model). It's fair
to say that most of what we do is bug fixes and the like. When we have
a new feature or module under development, we like to run the code by
Damien before we commit it. He understands CouchDB inside and out, and
he's pretty good at seeing how an API detail or caching property will
affect the big picture of how people use CouchDB.

There have been votes on the dev list before, but they are rare
because we so often move with consensus.

-- 
Chris Anderson
http://jchris.mfdz.com

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 02/01/2009, at 10:51 PM, Geir Magnusson Jr. wrote:

> But I am a little worried that if the "project lead" and  
> "gatekeepers" are even *thought* by active members of the community  
> to be dead-set against it that I would be wasting time.

To be blunt, and this is my impression only: I thought Damien was  
strongly against changing it, and got the impression that there was  
some significant degree of deferral to his opinion, understandably. It  
is difficult to get a handle on the social and procedural dynamic in  
this situation.

I really don't want any political/social meta-issue to get out of
hand, sorry if I led it in that direction.

> I think it's an important issue (clearly or I wouldn't be spending  
> so much vacation time discussing it), but it's not a make-or-break  
> issue for me.

Ditto.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Nothing is really work unless you would rather be doing something else.
   -- J. M. Barrie

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

I actually don't mind digging in and spending the time showing what I
mean with code (although the barrier here is much higher since I don't
know Erlang at all...)

But I am a little worried that if the "project lead" and "gatekeepers"  
are even *thought* by active members of the community to be dead-set  
against it that I would be wasting time.

I think it's an important issue (clearly or I wouldn't be spending so  
much vacation time discussing it), but it's not a make-or-break issue  
for me.

geir


On Jan 2, 2009, at 7:17 AM, Noah Slater wrote:

> Not really, my point stands. Show the code, then we vote. :)
>
> On the other hand, consensus building can't harm either.
>
> On Fri, Jan 02, 2009 at 07:14:26AM -0500, Geir Magnusson Jr. wrote:
>> Maybe the tradition should be of an Apache project? :)
>>
>> From my limited interaction w/ the Linux kernel community, it's a  
>> very
>> different beastie...
>>
>> geir
>>
>> On Jan 1, 2009, at 10:47 PM, Noah Slater wrote:
>>
>>> On Fri, Jan 02, 2009 at 09:15:21AM +1030, Antony Blakey wrote:
>>>> No. The primary reason is "why change - the current mechanism has
>>>> worked for a year". Damien (project lead) doesn't regard change as
>>>> necessary, and a significant change to support top-level reflexivity
>>>> (which is your primary thrust) doesn't have support from the other
>>>> gatekeepers. There is some support for name identity, although I
>>>> suspect not enough to prompt a change.
>>>
>>> I appreciate you're frustrated with the current situation Antony, but
>>> I think it's unfair for you to be claiming any kind of consensus
>>> without a vote. I would be interested in seeing a patch, explanation,
>>> and vote. I've already expressed my agreement with many of the points
>>> you've raised, and I'm not the only one.
>>>
>>> It's pretty pointless for us to keep sending emails over proposed
>>> changes to the code without actually seeing the changes. So, in the
>>> tradition of the Linux kernel, show the code and let's have a vote!
>>>
>>
>
> -- 
> Noah Slater, http://tumbolia.org/nslater

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Noah Slater <ns...@apache.org>.

Not really, my point stands. Show the code, then we vote. :)

On the other hand, consensus building can't harm either.

On Fri, Jan 02, 2009 at 07:14:26AM -0500, Geir Magnusson Jr. wrote:
> Maybe the tradition should be of an Apache project? :)
>
> From my limited interaction w/ the Linux kernel community, it's a very
> different beastie...
>
> geir
>
> On Jan 1, 2009, at 10:47 PM, Noah Slater wrote:
>
>> On Fri, Jan 02, 2009 at 09:15:21AM +1030, Antony Blakey wrote:
>>> No. The primary reason is "why change - the current mechanism has
>>> worked for a year". Damien (project lead) doesn't regard change as
>>> necessary, and a significant change to support top-level reflexivity
>>> (which is your primary thrust) doesn't have support from the other
>>> gatekeepers. There is some support for name identity, although I
>>> suspect not enough to prompt a change.
>>
>> I appreciate you're frustrated with the current situation Antony, but
>> I think it's unfair for you to be claiming any kind of consensus
>> without a vote. I would be interested in seeing a patch, explanation,
>> and vote. I've already expressed my agreement with many of the points
>> you've raised, and I'm not the only one.
>>
>> It's pretty pointless for us to keep sending emails over proposed
>> changes to the code without actually seeing the changes. So, in the
>> tradition of the Linux kernel, show the code and let's have a vote!
>>
>

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

Maybe the tradition should be of an Apache project? :)

 From my limited interaction w/ the Linux kernel community, it's a  
very different beastie...

geir

On Jan 1, 2009, at 10:47 PM, Noah Slater wrote:

> On Fri, Jan 02, 2009 at 09:15:21AM +1030, Antony Blakey wrote:
>> No. The primary reason is "why change - the current mechanism has
>> worked for a year". Damien (project lead) doesn't regard change as
>> necessary, and a significant change to support top-level reflexivity
>> (which is your primary thrust) doesn't have support from the other
>> gatekeepers. There is some support for name identity, although I
>> suspect not enough to prompt a change.
>
> I appreciate you're frustrated with the current situation Antony, but
> I think it's unfair for you to be claiming any kind of consensus
> without a vote. I would be interested in seeing a patch, explanation,
> and vote. I've already expressed my agreement with many of the points
> you've raised, and I'm not the only one.
>
> It's pretty pointless for us to keep sending emails over proposed
> changes to the code without actually seeing the changes. So, in the
> tradition of the Linux kernel, show the code and let's have a vote!
>
> -- 
> Noah Slater, http://tumbolia.org/nslater

Decision making process (Re: Changing rev to _rev in view results (Was: Re: newbie question #1))

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

On Jan 2, 2009, at 12:33 AM, Antony Blakey wrote:
>
> It's never been clear to me that there is a process for voting - the  
> decision making process within the commit group seems opaque.

How can that be?  I assume that all decisions are made in public on  
the dev@ mailing list.

geir

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Robert Dionne <bo...@gmail.com>.

When it comes to design I think there are always tradeoffs. This is  
where intuition and experience count. In my opinion separating  
metadata from the user's data is a more complex approach. It creates  
two parts to the document, they have to be handled separately and it  
creates the need for two kinds of API calls for the two types of  
data.  It seems like a good approach, however it's very easy to look  
at an existing implementation and see how things "ought" to be done.

The current implementation has a nice simplicity to it that I would  
not readily advocate changing. My first impression is that it  
reminded me of Berkeley DB on steroids. The convention governing the  
use of the _id is not that hard to deal with and it doesn't prevent  
one from handling JSON docs that come from elsewhere. It seems that  
converting data from one database system to another always involves  
some transformation.

This discussion reminds me of Perlis' epigram (#15) that everything
should be built top-down, except the first time.


On Jan 2, 2009, at 12:33 AM, Antony Blakey wrote:

>
> On 02/01/2009, at 2:17 PM, Noah Slater wrote:
>
>> I appreciate you're frustrated with the current situation Antony,  
>> but I think
>> it's unfair for you to be claiming any kind of consensus without a  
>> vote.
>
> That post wasn't meant to be a criticism. Apologies if it felt like  
> it was.
>
> There isn't a clear consensus in this thread, which to my mind  
> reflects the fact that there are trade-offs that don't have  
> objective evaluation measures.
>
> I fully support the idea that a product should reflect the vision  
> and opinion of a very small group. Abstracting from my preference  
> for a more robustly theoretical approach to API design, the
> holistically best result is likely to arise from this model. So I  
> don't e.g. mean 'gatekeeper' in a negative way.
>
>> I would
>> be interested in seeing a patch, explanation, and vote. I've  
>> already expressed
>> my agreement with many of the points you've raised, and I'm not  
>> the only one.
>
> I was only referring to a lack of expressed support for a fully  
> reflexive model.
>
> It's never been clear to me that there is a process for voting -  
> the decision making process within the commit group seems opaque.
>
>> It's pretty pointless for us to keep sending emails over proposed  
>> changes to the
>> code without actually seeing the changes.
>
> I think a change to the API could be decided without reference to  
> the code implementing that change. In fact, IMO the API *should* be  
> considered separately from the code implementing that change.  
> Otherwise APIs will tend to be decided not on the basis of design,  
> but on the amount of effort some person is prepared to spend to  
> demonstrate it, and hence code inertia, often resulting in  
> expedient solutions. This means that good, but expensive ideas, can  
> be lost.
>
> The models under discussion have evolved from simple name identity  
> by using '_id' and '_rev' everywhere, to a '_meta' wrapper, to  
> Geir's fully reflexive model.
>
> So I'd prefer to get buy-in to a model or principles, at which  
> point anyone could implement it. That's what I tried to do with the  
> change to the FS layout to support i18n, the committable  
> implementation of which is my focus right now.
>
> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> The intuitive mind is a sacred gift and the rational mind is a  
> faithful servant. We have created a society that honours the  
> servant and has forgotten the gift.
>   -- Albert Einstein
>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 02/01/2009, at 2:17 PM, Noah Slater wrote:

> I appreciate you're frustrated with the current situation Antony,  
> but I think
> it's unfair for you to be claiming any kind of consensus without a  
> vote.

That post wasn't meant to be a criticism. Apologies if it felt like it  
was.

There isn't a clear consensus in this thread, which to my mind  
reflects the fact that there are trade-offs that don't have objective  
evaluation measures.

I fully support the idea that a product should reflect the vision and  
opinion of a very small group. Abstracting from my preference for a  
more robustly theoretical approach to API design, the holistically best
result is likely to arise from this model. So I don't e.g. mean  
'gatekeeper' in a negative way.

> I would
> be interested in seeing a patch, explanation, and vote. I've already  
> expressed
> my agreement with many of the points you've raised, and I'm not the  
> only one.

I was only referring to a lack of expressed support for a fully  
reflexive model.

It's never been clear to me that there is a process for voting - the  
decision making process within the commit group seems opaque.

> It's pretty pointless for us to keep sending emails over proposed  
> changes to the
> code without actually seeing the changes.

I think a change to the API could be decided without reference to the  
code implementing that change. In fact, IMO the API *should* be  
considered separately from the code implementing that change.  
Otherwise APIs will tend to be decided not on the basis of design, but  
on the amount of effort some person is prepared to spend to  
demonstrate it, and hence code inertia, often resulting in expedient  
solutions. This means that good, but expensive ideas, can be lost.

The models under discussion have evolved from simple name identity by  
using '_id' and '_rev' everywhere, to a '_meta' wrapper, to Geir's  
fully reflexive model.

So I'd prefer to get buy-in to a model or principles, at which point  
anyone could implement it. That's what I tried to do with the change  
to the FS layout to support i18n, the committable implementation of  
which is my focus right now.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

The intuitive mind is a sacred gift and the rational mind is a  
faithful servant. We have created a society that honours the servant  
and has forgotten the gift.
   -- Albert Einstein

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Noah Slater <ns...@apache.org>.

On Fri, Jan 02, 2009 at 09:15:21AM +1030, Antony Blakey wrote:
> No. The primary reason is "why change - the current mechanism has worked for a
> year". Damien (project lead) doesn't regard change as necessary, and a
> significant change to support top-level reflexivity (which is your primary
> thrust) doesn't have support from the other gatekeepers. There is some support
> for name identity, although I suspect not enough to prompt a change.

I appreciate you're frustrated with the current situation Antony, but I think
it's unfair for you to be claiming any kind of consensus without a vote. I would
be interested in seeing a patch, explanation, and vote. I've already expressed
my agreement with many of the points you've raised, and I'm not the only one.

It's pretty pointless for us to keep sending emails over proposed changes to the
code without actually seeing the changes. So, in the tradition of the Linux
kernel, show the code and let's have a vote!

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 02/01/2009, at 9:32 PM, Jan Lehnardt wrote:

>
> On 2 Jan 2009, at 01:16, Geir Magnusson Jr. wrote:
>
>> ok - that's the first time in the 9 years I've been associated with  
>> the ASF that I've heard that term.  Sounds kinda like their goal
>> is to keep things out, rather than get people and ideas in and  
>> involved :)
>
> This is not an ASF-term (afaik), but Antony's. :)

Correct. I use it because I thought it reflected the decision making
structure, and I agree with that structure i.e. a community that makes  
arguments to convince a committee, said committee being the actual  
decision makers. In this case, the people with commit rights are  
effectively the decision makers, because Couch will be what they commit.

Maybe the term causes distress. I'll cease and desist.

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Only two things are infinite, the universe and human stupidity, and  
I'm not sure about the former.
  -- Albert Einstein

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Jan Lehnardt <ja...@apache.org>.

On 2 Jan 2009, at 01:16, Geir Magnusson Jr. wrote:

> ok - that's the first time in the 9 years I've been associated with  
> the ASF that I've heard that term.  Sounds kinda like their goal
> is to keep things out, rather than get people and ideas in and  
> involved :)

This is not an ASF-term (afaik), but Antony's. :)

Cheers
Jan
--

>
>
> geir
>
> On Jan 1, 2009, at 6:51 PM, Antony Blakey wrote:
>
>>
>> On 02/01/2009, at 9:42 AM, Geir Magnusson Jr. wrote:
>>
>>> What is this "gatekeeper" thing I keep hearing about?  Do you mean  
>>> committer?
>>
>> Yes.
>>
>> Antony Blakey
>> --------------------------
>> CTO, Linkuistics Pty Ltd
>> Ph: 0438 840 787
>>
>> Man will never be free until the last king is strangled with the  
>> entrails of the last priest.
>> -- Denis Diderot
>>
>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

ok - that's the first time in the 9 years I've been associated with  
the ASF that I've heard that term.  Sounds kinda like their goal is
to keep things out, rather than get people and ideas in and involved :)

geir

On Jan 1, 2009, at 6:51 PM, Antony Blakey wrote:

>
> On 02/01/2009, at 9:42 AM, Geir Magnusson Jr. wrote:
>
>> What is this "gatekeeper" thing I keep hearing about?  Do you mean  
>> committer?
>
> Yes.
>
> Antony Blakey
> --------------------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> Man will never be free until the last king is strangled with the  
> entrails of the last priest.
>  -- Denis Diderot
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 02/01/2009, at 9:42 AM, Geir Magnusson Jr. wrote:

> What is this "gatekeeper" thing I keep hearing about?  Do you mean  
> committer?

Yes.

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Man will never be free until the last king is strangled with the  
entrails of the last priest.
   -- Denis Diderot

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

What is this "gatekeeper" thing I keep hearing about?  Do you mean  
committer?

On Jan 1, 2009, at 5:45 PM, Antony Blakey wrote:

>
> On 02/01/2009, at 8:15 AM, Geir Magnusson Jr. wrote:
>
>> If you could point me to an explanation of why changing this is  
>> bad, I'd love to catch up on the discussion.  I assume it's a  
>> technical reason?
>
> No. The primary reason is "why change - the current mechanism has  
> worked for a year". Damien (project lead) doesn't regard change as  
> necessary, and a significant change to support top-level reflexivity  
> (which is your primary thrust) doesn't have support from the other  
> gatekeepers. There is some support for name identity, although I  
> suspect not enough to prompt a change.
>
> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> There are two ways of constructing a software design: One way is to  
> make it so simple that there are obviously no deficiencies, and the  
> other way is to make it so complicated that there are no obvious  
> deficiencies.
>  -- C. A. R. Hoare
>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

On Jan 1, 2009, at 5:45 PM, Antony Blakey wrote:

>
> On 02/01/2009, at 8:15 AM, Geir Magnusson Jr. wrote:
>
>> If you could point me to an explanation of why changing this is  
>> bad, I'd love to catch up on the discussion.  I assume it's a  
>> technical reason?
>
> No. The primary reason is "why change - the current mechanism has  
> worked for a year". Damien (project lead) doesn't regard change as  
> necessary, and a significant change to support top-level reflexivity  
> (which is your primary thrust) doesn't have support from the other  
> gatekeepers. There is some support for name identity, although I  
> suspect not enough to prompt a change.
>

Alright.  I hereby cry "Uncle" and will let this go (for a while,  
anyway...)

Thanks all for such an interesting conversation.  I look forward to  
more in this community, as I really find document and other  
"alternative" databases interesting, and think they are going to be  
key for "cloud" computing...

/me wanders off to try and get more than 6 docs/sec into Couch...

geir

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 02/01/2009, at 8:15 AM, Geir Magnusson Jr. wrote:

> If you could point me to an explanation of why changing this is bad,  
> I'd love to catch up on the discussion.  I assume it's a technical  
> reason?

No. The primary reason is "why change - the current mechanism has  
worked for a year". Damien (project lead) doesn't regard change as  
necessary, and a significant change to support top-level reflexivity  
(which is your primary thrust) doesn't have support from the other  
gatekeepers. There is some support for name identity, although I  
suspect not enough to prompt a change.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

There are two ways of constructing a software design: One way is to  
make it so simple that there are obviously no deficiencies, and the  
other way is to make it so complicated that there are no obvious  
deficiencies.
   -- C. A. R. Hoare

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Jan Lehnardt <ja...@apache.org>.

On 2 Jan 2009, at 02:24, Geir Magnusson Jr. wrote:

> BTW, for maximum utility,  I think that the view API will have to  
> change as well.  There's incredible power in the CDB view model, but  
> you'll want to be able to return a pure "user document" from a call  
> to a view (conform to some specific "schema"), rather than at least  
> what I understand is the current metadata-oriented structure.

Can you expand on that? With examples and all?

Cheers
Jan
--


>
> On Jan 1, 2009, at 7:53 PM, Damien Katz wrote:
>
>> Why can't you just always stick the desired document into a body
>> field on the document? If you always do that, then you can round  
>> trip without problem.
>>
>> -Damien
>>
>>
>> On Jan 1, 2009, at 7:17 PM, Geir Magnusson Jr. wrote:
>>
>>>
>>> On Jan 1, 2009, at 7:14 PM, Adam Kocoloski wrote:
>>>
>>>> On Jan 1, 2009, at 4:45 PM, Geir Magnusson Jr. wrote:
>>>>
>>>>> b) I should have the choice to not have it injected at all
>>>>>
>>>>> So why do I think this is a problem?  The 10gen appserver auto- 
>>>>> injects an id field into the JSON documents that are stored in  
>>>>> our database, Mongo.  Can you guess what the key is?  Yep - "_id"
>>>>>
>>>>> So how can I roundtrip a doc from 10gen through couch and back?   
>>>>> I can't.
>>>>
>>>> Perhaps its worth noting that CouchDB is perfectly comfortable  
>>>> with externally generated _ids.  It only injects an _id if you  
>>>> create a new document without one.  Best,
>>>
>>> I understand that.
>>>
>>> I was just pointing out a real-world case where a JSON doc from  
>>> "somewhere else" runs into trouble...  (and yes, the issue applies  
>>> equally to the 10gen platform, when coming from "somewhere else" :)
>>>
>>> geir
>>>
>>>>
>>
>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

On Jan 1, 2009, at 9:14 PM, Flinn Mueller wrote:

> On Jan 1, 2009, at 8:24 PM, Geir Magnusson Jr. wrote:
>
>> I know I can do that.  And if CouchDB is the only "JSON source"  
>> that my apps are talking to, then that's fine - all apps can be  
>> written to expect that "schema".
>>
>> But I'm taking a different POV - where a "schema" exists outside of  
>> my app (a pseudo-standard defined by someone else) and I want to  
>> use CouchDB as a source of documents that conform to that schema.   
>> My apps should be able to consume documents in that "JSON schema"  
>> that are sourced from CouchDB, a httpd server returning static  
>> documents, some servlet app running in Tomcat, some .NET thingy, etc.
>>
>> Once you force me to store documents in a new format in order to  
>> protect data in my document that clashes w/ the server's metadata  
>> by sticking the document of interest in a top-level field :
>
>
> Isn't this an issue with any data store?  It's established that _id
> is arbitrary just like it could be in just about any other data  
> store.  If this is a problem in couchdb it's a problem for you in  
> any data store isn't it?

If the data store chooses to inject its metadata into my document,
yes it will be, and that's my point.

>
>
>
>> {
>>   _rev : ...
>>   _id : ...
>>   mydata : { ... the real document ... }
>> }
>>
>> then I think that CDB loses something in terms of being a general  
>> JSON document store.
>
>
>
> You're looking at couchdb's document as if it's your JSON document.   
> It's couchdb's document and it happens to be JSON.

Exactly. That is what I note below - that CDB isn't a "general JSON
store", it's a store that "renders" its data in JSON.  (I hope it's
clear that I think the world needs a "general JSON store" :)

>  There is nothing at all wrong with the above "schema" and it's  
> arguably the best way to store a document that you don't want to  
> conflict.  The couchdb document is always going to need metadata.   
> If it's not in _id then it's _farfagnugen and someone will  
> inevitably have the same issue.

Yep, exactly!  That's why I suggest formally separating the metadata  
out of the user data, and enhance the REST and view APIs so that you  
can get

a) just the user data (e.g. my AJAX app doesn't care or worry about  
the metadata)
b) both meta and user in the format like above
c) only meta for things where you don't care about the user data
d) maybe even a legacy mode where you inject meta into the userdata as  
it is today
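
A minimal sketch of the four modes listed above, assuming a stored
record that keeps "metadata" and "userdata" apart as proposed earlier
in this thread. The function name project, the mode names, and the
sample values are hypothetical; none of this is an existing CouchDB
API.

```python
# Hypothetical sketch only: "project" and its modes are illustrative
# names, not part of CouchDB.

def project(stored, mode="user"):
    """Return one of four renderings of a stored record."""
    meta = stored["metadata"]
    user = stored["userdata"]
    if mode == "user":      # (a) just the user data
        return dict(user)
    if mode == "both":      # (b) metadata and user data, kept apart
        return {"metadata": dict(meta), "userdata": dict(user)}
    if mode == "meta":      # (c) only the metadata
        return dict(meta)
    if mode == "legacy":    # (d) metadata injected into the user data
        merged = dict(user)
        # Injected keys win, so a user-supplied "_id" is overwritten.
        merged.update({"_" + key: value for key, value in meta.items()})
        return merged
    raise ValueError("unknown mode: %r" % (mode,))


# Sample record: the user document carries its own "_id" from elsewhere.
stored = {
    "metadata": {"id": "doc-1", "rev": "1-abc"},
    "userdata": {"_id": "mongo-42", "title": "hello"},
}

print(project(stored, "user"))
print(project(stored, "legacy"))
```

In this sketch only the "legacy" mode loses the user's own "_id",
which is the clash being discussed; the other three modes return the
user data untouched.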

geir

>
>
>
>
>> Now, I realize that no one ever said that CDB is a general JSON  
>> document store, rather it's a datastore that happens to return data  
>> in JSON.  The difference is subtle, but very important.   It will be
>> interesting to see how this space ("document databases") plays out,  
>> and if my concerns are valid.  Time will tell, I guess.
>>
>> BTW, for maximum utility,  I think that the view API will have to  
>> change as well.  There's incredible power in the CDB view model,  
>> but you'll want to be able to return a pure "user document" from a  
>> call to a view (conform to some specific "schema"), rather than at  
>> least what I understand is the current metadata-oriented structure.
>>
>> geir
>>
>>
>>
>>
>> On Jan 1, 2009, at 7:53 PM, Damien Katz wrote:
>>
>>> Why can't you just always stick the desired document into a body
>>> field on the document? If you always do that, then you can round  
>>> trip without problem.
>>>
>>> -Damien
>>>
>>>
>>> On Jan 1, 2009, at 7:17 PM, Geir Magnusson Jr. wrote:
>>>
>>>>
>>>> On Jan 1, 2009, at 7:14 PM, Adam Kocoloski wrote:
>>>>
>>>>> On Jan 1, 2009, at 4:45 PM, Geir Magnusson Jr. wrote:
>>>>>
>>>>>> b) I should have the choice to not have it injected at all
>>>>>>
>>>>>> So why do I think this is a problem?  The 10gen appserver auto- 
>>>>>> injects an id field into the JSON documents that are stored in  
>>>>>> our database, Mongo.  Can you guess what the key is?  Yep - "_id"
>>>>>>
>>>>>> So how can I roundtrip a doc from 10gen through couch and  
>>>>>> back?  I can't.
>>>>>
>>>>> Perhaps its worth noting that CouchDB is perfectly comfortable  
>>>>> with externally generated _ids.  It only injects an _id if you  
>>>>> create a new document without one.  Best,
>>>>
>>>> I understand that.
>>>>
>>>> I was just pointing out a real-world case where a JSON doc from  
>>>> "somewhere else" runs into trouble...  (and yes, the issue  
>>>> applies equally to the 10gen platform, when coming from  
>>>> "somewhere else" :)
>>>>
>>>> geir
>>>>
>>>>>
>>>
>>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Flinn Mueller <th...@gmail.com>.

On Jan 1, 2009, at 8:24 PM, Geir Magnusson Jr. wrote:

> I know I can do that.  And if CouchDB is the only "JSON source" that  
> my apps are talking to, then that's fine - all apps can be written  
> to expect that "schema".
>
> But I'm taking a different POV - where a "schema" exists outside of  
> my app (a pseudo-standard defined by someone else) and I want to use  
> CouchDB as a source of documents that conform to that schema.  My  
> apps should be able to consume documents in that "JSON schema" that  
> are sourced from CouchDB, a httpd server returning static documents,  
> some servlet app running in Tomcat, some .NET thingy, etc.
>
> Once you force me to store documents in a new format in order to  
> protect data in my document that clashes w/ the server's metadata by  
> sticking the document of interest in a top-level field :


Isn't this an issue with any data store?  It's established that _id
is arbitrary just like it could be in just about any other data  
store.  If this is a problem in couchdb it's a problem for you in any  
data store isn't it?


> {
>    _rev : ...
>    _id : ...
>    mydata : { ... the real document ... }
> }
>
> then I think that CDB loses something in terms of being a general  
> JSON document store.



You're looking at couchdb's document as if it's your JSON document.   
It's couchdb's document and it happens to be JSON.  There is nothing  
at all wrong with the above "schema" and it's arguably the best way to  
store a document that you don't want to conflict.  The couchdb  
document is always going to need metadata.  If it's not in _id then  
it's _farfagnugen and someone will inevitably have the same issue.



> Now, I realize that no one ever said that CDB is a general JSON  
> document store, rather it's a datastore that happens to return data  
> in JSON.  The difference is subtle, but very important.   It will be
> interesting to see how this space ("document databases") plays out,  
> and if my concerns are valid.  Time will tell, I guess.
>
> BTW, for maximum utility,  I think that the view API will have to  
> change as well.  There's incredible power in the CDB view model, but  
> you'll want to be able to return a pure "user document" from a call  
> to a view (conform to some specific "schema"), rather than at least  
> what I understand is the current metadata-oriented structure.
>
> geir
>
>
>
>
> On Jan 1, 2009, at 7:53 PM, Damien Katz wrote:
>
>> Why can't you just always stick the desired document into a body
>> field on the document? If you always do that, then you can round  
>> trip without problem.
>>
>> -Damien
>>
>>
>> On Jan 1, 2009, at 7:17 PM, Geir Magnusson Jr. wrote:
>>
>>>
>>> On Jan 1, 2009, at 7:14 PM, Adam Kocoloski wrote:
>>>
>>>> On Jan 1, 2009, at 4:45 PM, Geir Magnusson Jr. wrote:
>>>>
>>>>> b) I should have the choice to not have it injected at all
>>>>>
>>>>> So why do I think this is a problem?  The 10gen appserver auto- 
>>>>> injects an id field into the JSON documents that are stored in  
>>>>> our database, Mongo.  Can you guess what the key is?  Yep - "_id"
>>>>>
>>>>> So how can I roundtrip a doc from 10gen through couch and back?   
>>>>> I can't.
>>>>
>>>> Perhaps its worth noting that CouchDB is perfectly comfortable  
>>>> with externally generated _ids.  It only injects an _id if you  
>>>> create a new document without one.  Best,
>>>
>>> I understand that.
>>>
>>> I was just pointing out a real-world case where a JSON doc from  
>>> "somewhere else" runs into trouble...  (and yes, the issue applies  
>>> equally to the 10gen platform, when coming from "somewhere else" :)
>>>
>>> geir
>>>
>>>>
>>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

I know I can do that.  And if CouchDB is the only "JSON source" that  
my apps are talking to, then that's fine - all apps can be written to  
expect that "schema".

But I'm taking a different POV - where a "schema" exists outside of my  
app (a pseudo-standard defined by someone else) and I want to use  
CouchDB as a source of documents that conform to that schema.  My apps  
should be able to consume documents in that "JSON schema" that are  
sourced from CouchDB, a httpd server returning static documents, some  
servlet app running in Tomcat, some .NET thingy, etc.

Once you force me to store documents in a new format in order to  
protect data in my document that clashes w/ the server's metadata by  
sticking the document of interest in a top-level field :

  {
     _rev : ...
     _id : ...
     mydata : { ... the real document ... }
  }

then I think that CDB loses something in terms of being a general JSON  
document store.

Now, I realize that no one ever said that CDB is a general JSON  
document store, rather it's a datastore that happens to return data in  
JSON.  The difference is subtle, but very important.   It will be
interesting to see how this space ("document databases") plays out,  
and if my concerns are valid.  Time will tell, I guess.

BTW, for maximum utility,  I think that the view API will have to  
change as well.  There's incredible power in the CDB view model, but  
you'll want to be able to return a pure "user document" from a call to  
a view (conform to some specific "schema"), rather than at least what  
I understand is the current metadata-oriented structure.
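
A rough sketch of what a client has to do today to get "pure" user
documents out of a view response. It assumes the view emits the whole
stored document as each row's value and that the user's document sits
in a top-level "mydata" field as in the layout above. The helper name
user_documents and the sample rows are made up for illustration.

```python
# Hypothetical client-side helper; not a CouchDB API.

def user_documents(view_result, field="mydata"):
    """Strip the view envelope and return only the wrapped user documents."""
    docs = []
    for row in view_result.get("rows", []):
        value = row.get("value")
        # Skip rows whose value is not a wrapped document.
        if isinstance(value, dict) and field in value:
            docs.append(value[field])
    return docs


# Sample response in the shape a view returns: an envelope with "rows",
# each row holding "id", "key" and "value".
view_result = {
    "total_rows": 2,
    "offset": 0,
    "rows": [
        {"id": "a", "key": "a",
         "value": {"_id": "a", "_rev": "1-x", "mydata": {"title": "one"}}},
        {"id": "b", "key": "b",
         "value": {"_id": "b", "_rev": "1-y", "mydata": {"title": "two"}}},
    ],
}

print(user_documents(view_result))
```

The point of the request is that this unwrapping would happen on the
server, so that a consumer which only knows the external "schema" never
sees the envelope.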

geir

On Jan 1, 2009, at 7:53 PM, Damien Katz wrote:

> Why can't you just always stick the desired document into a body
> field on the document? If you always do that, then you can round  
> trip without problem.
>
> -Damien
>
>
> On Jan 1, 2009, at 7:17 PM, Geir Magnusson Jr. wrote:
>
>>
>> On Jan 1, 2009, at 7:14 PM, Adam Kocoloski wrote:
>>
>>> On Jan 1, 2009, at 4:45 PM, Geir Magnusson Jr. wrote:
>>>
>>>> b) I should have the choice to not have it injected at all
>>>>
>>>> So why do I think this is a problem?  The 10gen appserver auto- 
>>>> injects an id field into the JSON documents that are stored in  
>>>> our database, Mongo.  Can you guess what the key is?  Yep - "_id"
>>>>
>>>> So how can I roundtrip a doc from 10gen through couch and back?   
>>>> I can't.
>>>
>>> Perhaps its worth noting that CouchDB is perfectly comfortable  
>>> with externally generated _ids.  It only injects an _id if you  
>>> create a new document without one.  Best,
>>
>> I understand that.
>>
>> I was just pointing out a real-world case where a JSON doc from  
>> "somewhere else" runs into trouble...  (and yes, the issue applies  
>> equally to the 10gen platform, when coming from "somewhere else" :)
>>
>> geir
>>
>>>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 02/01/2009, at 11:23 AM, Damien Katz wrote:

> Why can't you just always stick the desired document into a body  
> field on the document? If you always do that, then you can round  
> trip without problem.

Sure, you can always do this:

{
  _id: ...
  _rev: ...
  data: {
    ... the user's document ...
  }
}

as a matter of policy in your own application.
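
A minimal sketch of that policy in Python, assuming the application wraps
every document before saving and unwraps it after loading. The helper names
wrap and unwrap and the sample ids are hypothetical, not part of any CouchDB
API.

```python
import copy


def wrap(user_doc, doc_id=None, rev=None):
    """Put the user's document, untouched, under a single "data" member."""
    envelope = {"data": copy.deepcopy(user_doc)}
    if doc_id is not None:
        envelope["_id"] = doc_id
    if rev is not None:
        envelope["_rev"] = rev
    return envelope


def unwrap(envelope):
    """Return the user's document exactly as it was handed to wrap()."""
    return copy.deepcopy(envelope["data"])


# A document from another system that already uses "_id" for its own purposes.
foreign = {"_id": "mongo-4f2a", "title": "hello"}
stored = wrap(foreign, doc_id="couch-0001")
round_tripped = unwrap(stored)
```

Because the foreign "_id" lives inside data, it never collides with the
top-level _id, and the document comes back unchanged.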

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Some defeats are instalments to victory.
   -- Jacob Riis

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Damien Katz <da...@apache.org>.

Why can't you just always stick the desired document into a body  
field on the document? If you always do that, then you can round trip  
without problem.

-Damien


On Jan 1, 2009, at 7:17 PM, Geir Magnusson Jr. wrote:

>
> On Jan 1, 2009, at 7:14 PM, Adam Kocoloski wrote:
>
>> On Jan 1, 2009, at 4:45 PM, Geir Magnusson Jr. wrote:
>>
>>> b) I should have the choice to not have it injected at all
>>>
>>> So why do I think this is a problem?  The 10gen appserver auto- 
>>> injects an id field into the JSON documents that are stored in our  
>>> database, Mongo.  Can you guess what the key is?  Yep - "_id"
>>>
>>> So how can I roundtrip a doc from 10gen through couch and back?  I  
>>> can't.
>>
>> Perhaps it's worth noting that CouchDB is perfectly comfortable with  
>> externally generated _ids.  It only injects an _id if you create a  
>> new document without one.  Best,
>
> I understand that.
>
> I was just pointing out a real-world case where a JSON doc from  
> "somewhere else" runs into trouble...  (and yes, the issue applies  
> equally to the 10gen platform, when coming from "somewhere else" :)
>
> geir
>
>>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

On Jan 1, 2009, at 7:14 PM, Adam Kocoloski wrote:

> On Jan 1, 2009, at 4:45 PM, Geir Magnusson Jr. wrote:
>
>> b) I should have the choice to not have it injected at all
>>
>> So why do I think this is a problem?  The 10gen appserver auto- 
>> injects an id field into the JSON documents that are stored in our  
>> database, Mongo.  Can you guess what the key is?  Yep - "_id"
>>
>> So how can I roundtrip a doc from 10gen through couch and back?  I  
>> can't.
>
> Perhaps it's worth noting that CouchDB is perfectly comfortable with  
> externally generated _ids.  It only injects an _id if you create a  
> new document without one.  Best,

I understand that.

I was just pointing out a real-world case where a JSON doc from  
"somewhere else" runs into trouble...  (and yes, the issue applies  
equally to the 10gen platform, when coming from "somewhere else" :)

geir

>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Adam Kocoloski <ad...@gmail.com>.

On Jan 1, 2009, at 4:45 PM, Geir Magnusson Jr. wrote:

> b) I should have the choice to not have it injected at all
>
> So why do I think this is a problem?  The 10gen appserver auto- 
> injects an id field into the JSON documents that are stored in our  
> database, Mongo.  Can you guess what the key is?  Yep - "_id"
>
> So how can I roundtrip a doc from 10gen through couch and back?  I  
> can't.

Perhaps it's worth noting that CouchDB is perfectly comfortable with  
externally generated _ids.  It only injects an _id if you create a new  
document without one.  Best,

Adam

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

On Dec 31, 2008, at 7:40 PM, Antony Blakey wrote:

>
> On 31/12/2008, at 11:29 PM, Geir Magnusson Jr. wrote:
>
>> What trouble?  I think this is *exactly* what should be done - have  
>> CouchDB store documents that are :
>>
>> {
>>   metadata : { _rev : X, _id : Y, _woogie: Z, .... anything that  
>> needs to be added in the future, like other metadata like last  
>> update date... },
>>   userdata : {  .... the document you want to store .... }
>> }
>>
>> and then offer APIs that let you :
>>
>> a) get to this document, for libraries and clients that know they  
>> are talking to Couch and want to manipulate at this level
>>
>> b) return and accept the userdocument directly, for clients that  
>> just want to consume or produce  JSON data, w/o caring about the  
>> internal housekeeping
>
> One of the issues complicating the logic of this discussion is that  
> the document id is both metadata and, conceptually, a document member.

Well, I don't understand why it has to be. Certainly it's a  
convenience, and I wonder how much of current thinking has been  
influenced by the fact that this is what people are used to.

I can understand why CDB needs a unique document identifier, and it  
certainly would be nice to have the option of having it shoved into  
the user doc on creation.  But

a) I think that I should have the choice as to what that identifier  
is  (e.g.  Configure the database to inject the couch metadata _id as  
"_couchID" or whatever...)

b) I should have the choice to not have it injected at all

So why do I think this is a problem?  The 10gen appserver auto-injects  
an id field into the JSON documents that are stored in our database,  
Mongo.  Can you guess what the key is?  Yep - "_id"

So how can I roundtrip a doc from 10gen through couch and back?  I  
can't.

I've made the same argument at 10gen - that I should be able to set  
the identifier (and that it shouldn't be in the doc in the first place).

Then, I'd just have a doc with

{
    _couchID : ....
    _mongoID : ....
     ... data...
}

(if I chose to shove the ID into the doc)
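
A rough Python sketch of what a configurable identifier key could look like
if done on the client side. This is purely illustrative: inject_id, strip_id
and the "_couchID" default are hypothetical, and nothing here is an existing
CouchDB or Mongo option.

```python
def inject_id(user_doc, doc_id, key="_couchID"):
    """Return a copy of user_doc with doc_id stored under the chosen key."""
    if key in user_doc:
        raise ValueError("document already has a %r member" % key)
    out = dict(user_doc)
    out[key] = doc_id
    return out


def strip_id(doc, key="_couchID"):
    """Split a document into (id, document without the id member)."""
    out = dict(doc)
    doc_id = out.pop(key, None)
    return doc_id, out


# A document that already carries another system's identifier.
from_mongo = {"_mongoID": "m-17", "title": "hello"}
with_both = inject_id(from_mongo, "c-42")
couch_id, original = strip_id(with_both)
```

Each store's identifier sits under its own key, so neither overwrites the
other and the original document can be recovered.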

> That's why, although the purest model is to have the userdata as a  
> member within a Couch document as you suggest, this doesn't look  
> that appealing:
>
> {
>  metadata: {
>    id: ...
>    rev: ...
>    ...
>  }
>  data: {
>    ... the user's document ...
>  }
> }

I can see how this isn't appealing from the perspective of current  
APIs, but a rethinking of this issue (_id and _rev) also warrants a  
re-thinking of the APIs to deal with this.

E.g. an API that lets me get a) the whole doc above  b) metadata only  
c) userdata only

>
>
> Furthermore, from a scalability perspective, always having the  
> metadata when you have the document, isn't a problem - the metadata  
> is constrained.

And from what I understand, it already exists in that manner, right?   
I mean, for efficiency, I'd guess that the _id, _rev and in the  
future, other metadata (like insert date, last modification date...)  
would be kept outside of the doc, so that they can be read and updated  
w/o having to serialize/deserialize the whole user document.

> The reverse situation of always having the data when you have the  
> metadata, is not constrained because the data is arbitrarily large.  
> IMO this means that a solution such as this:
>
> {
>  id: ...
>  rev: ...
>  ...
>  data: {
>    ... the user's document ...
>  }
> }
>
> isn't such a good idea compared to this:
>
> {
>  _metadata: {
>    id: ...
>    rev: ...
>  }
>  ... the user's document ...
> }

That only solves the problem in that there's only one reserved magical  
key (_metadata), but I don't think that really changes anything.  You  
still need to make sure any document you want to store in couch  
doesn't have a top-level _metadata element.

And while I don't know how couch works internally, we *are* really  
only talking about how the data is returned on an API call via the  
REST API or what I assume is an internal API for the M/R View stuff.

If you had an API that let you choose all, metaonly or useronly, you  
could not be burdened with stuff you didn't want or need.
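
As a sketch of that idea in Python, assuming documents are held in the
metadata/userdata layout proposed earlier in this thread. The function name
project and the mode strings are hypothetical; this is not an existing
CouchDB API.

```python
def project(stored, mode="all"):
    """Return the whole stored document, its metadata only, or its userdata only."""
    if mode == "all":
        return stored
    if mode == "metaonly":
        return stored["metadata"]
    if mode == "useronly":
        return stored["userdata"]
    raise ValueError("unknown mode: %r" % mode)


# Made-up stored document in the metadata/userdata layout.
stored = {
    "metadata": {"_id": "c-42", "_rev": "1-abc"},
    "userdata": {"_id": "m-17", "title": "hello"},
}

meta = project(stored, "metaonly")
user = project(stored, "useronly")
```

A client that only wants its own JSON asks for useronly and never sees the
housekeeping; a Couch-aware library asks for all or metaonly.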

>
> Unfortunately the reserved token makes the structure non-reflexive  
> without transformation, and although that's not currently an issue,  
> I can imagine it complicating certain use-cases. It makes the system  
> more complicated to reason about.
>
> I'm struggling to objectively evaluate this model and your reflexive  
> model - given Damien's attitude to this issue, my motivation to do  
> so is somewhat depressed :/

If you could point me to an explanation of why changing this is bad,  
I'd love to catch up on the discussion.  I assume it's a technical  
reason?

geir

>
>
> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> Did you hear about the Buddhist who refused Novocain during a root  
> canal?
> His goal: transcend dental medication.
>
>