Posted to user@couchdb.apache.org by Jan Lehnardt <ja...@apache.org> on 2008/12/28 20:21:05 UTC

Changing rev to _rev in view results (Was: Re: newbie question #1)

On 28 Dec 2008, at 14:32, Antony Blakey wrote:

>
> On 28/12/2008, at 11:56 PM, Paul Davis wrote:
>
>> Why "id" and "rev" are used instead of "_id" and
>> "_rev" I couldn't really tell you. I hate to say "historical reasons"
>> but I'm guessing that when Damien designed the view output he just
>> labeled them "id" and "rev" without the underscore because it's not
>> needed to distinguish from the rest of the doc.
>
> Desirable to change that (and any other inconsistencies) before a 1.0

This keeps coming up and I've been advocating this for a while now:

+1 for changing view result rows `rev` to `_rev` to avoid confusion.

CC'ing dev@c.a.o.

Cheers
Jan
--

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Noah Slater <ns...@apache.org>.

No problem, thanks for livening up the thread!

On Mon, Dec 29, 2008 at 03:55:33PM -0800, paul jobs wrote:
> sorry about this - went to wrong group by mistake
>
> On Mon, Dec 29, 2008 at 3:54 PM, paul jobs <we...@gmail.com> wrote:
>
> > Preview: *Ben Kingsley* and Penelope Cruz in Elegy | BeyondHollywood *...*
> > <http://www.beyondhollywood.com/preview-ben-kingsley-and-penelope-cruz-in-elegy/>   Jun
> > 10, 2008 *...* *Charismatic* professor DAVID KEPESH (*Ben Kingsley*)
> > glories in the pursuit of adventurous female students but never lets any
> > woman get too *...*
> >
> > http://www.beyondhollywood.com/preview-ben-kingsley-and-penelope-cruz-in-elegy/
> >  "Elegy": *Ben Kingsley* and Penélope Cruz stand out as lovers
> > <http://www.iconocast.com/A0000000019/U4/News1.htm>   Aug 8, 2008 *...*"Elegy'' (R): Director Isabel Coixet explores the relationship between a
> > *charismatic* college professor (*Ben Kingsley*) and a young woman *...*
> > http://www.iconocast.com/A0000000019/U4/News1.htm
> >

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Noah Slater <ns...@apache.org>.

This is a compelling argument.

I am +1 on using a consistent name throughout all contexts, by the way.

Whether that's _id or id doesn't really matter to me, but keeping it the same does.

On Tue, Dec 30, 2008 at 12:16:19PM +1030, Antony Blakey wrote:
>
> On 30/12/2008, at 11:55 AM, Damien Katz wrote:
>
>> Your argument about consistency and rigor being compromised is
>> unqualified. I see nothing more or less consistent or rigorous about
>> the current implementation versus other proposals, the rule as is is
>> easy to follow and use, and as far as I know has no inconsistencies.
>
> Having no rule is simpler than the current rule. The API is
> unnecessarily complicated by that rule. The fact that '_id' and '_rev'
> would have underscores everywhere is not a rule, it is a matter of
> identity, which is a fundamental concept i.e. a thing has a single name.
>
> It is inconsistent because sometimes a document id is named 'id' and
> sometimes '_id'. You claim that the name is consistent modulo the
> application of the rule. I assert that is prima facie inconsistent.
>
> As far as rigor being compromised, I assert that the current scheme
> violates Ockham's Razor.
>
> Furthermore, remember that this was brought up by Geir in the context of
> a first approach to the API. I understand the rule and yet still find it
> annoying to have to code against both 'id' and '_id' depending on
> context. It has more cognitive load that having a single name for that
> type of thing.
>
> You claim that:
>
>> The current rule maybe not the most intuitive to a newbie, but it is
>> far more consistent and easier to work with then when using the deeper
>> APIs.
>
> I think that fact that it is not the most intuitive to a newbie is a
> telling point, given that I don't think the second phrase is true. How
> are the deeper APIs made simpler and more consistent by having two names
> for the id depending on context?
>
> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> There are two ways of constructing a software design: One way is to make
> it so simple that there are obviously no deficiencies, and the other way
> is to make it so complicated that there are no obvious deficiencies.
>   -- C. A. R. Hoare
>
>

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 30/12/2008, at 12:16 PM, Antony Blakey wrote:

> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> There are two ways of constructing a software design: One way is to  
> make it so simple that there are obviously no deficiencies, and the  
> other way is to make it so complicated that there are no obvious  
> deficiencies.
>  -- C. A. R. Hoare

Random sig. How appropriate :) And how about this ...

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

What can be done with fewer [assumptions] is done in vain with more
   -- William of Ockham (ca. 1285-1349)

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 30/12/2008, at 11:55 AM, Damien Katz wrote:

> Your argument about consistency and rigor being compromised is  
> unqualified. I see nothing more or less consistent or rigorous about  
> the current implementation versus other proposals, the rule as is is  
> easy to follow and use, and as far as I know has no inconsistencies.

Having no rule is simpler than the current rule. The API is  
unnecessarily complicated by that rule. The fact that '_id' and '_rev'  
would have underscores everywhere is not a rule, it is a matter of  
identity, which is a fundamental concept i.e. a thing has a single name.

It is inconsistent because sometimes a document id is named 'id' and  
sometimes '_id'. You claim that the name is consistent modulo the  
application of the rule. I assert that is prima facie inconsistent.

As far as rigor being compromised, I assert that the current scheme  
violates Ockham's Razor.

Furthermore, remember that this was brought up by Geir in the context  
of a first approach to the API. I understand the rule and yet still  
find it annoying to have to code against both 'id' and '_id' depending  
on context. It carries more cognitive load than having a single name for
that type of thing.

You claim that:

> The current rule maybe not the most intuitive to a newbie, but it is  
> far more consistent and easier to work with then when using the  
> deeper APIs.

I think the fact that it is not the most intuitive to a newbie is a
telling point, given that I don't think the second phrase is true. How  
are the deeper APIs made simpler and more consistent by having two  
names for the id depending on context?

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

There are two ways of constructing a software design: One way is to  
make it so simple that there are obviously no deficiencies, and the  
other way is to make it so complicated that there are no obvious  
deficiencies.
   -- C. A. R. Hoare

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Damien Katz <da...@apache.org>.

On Dec 29, 2008, at 7:37 PM, Antony Blakey wrote:

>
> On 30/12/2008, at 11:00 AM, Noah Slater wrote:
>
>> On Tue, Dec 30, 2008 at 10:08:04AM +1030, Antony Blakey wrote:
>>> Any objection to this must be aesthetic.
>>
>> In the general case, and presuming it doesn't trump an obvious  
>> technical goal
>
> Which is the essence of this argument - IMO API consistency and  
> rigor wrt design principles is being compromised by a) arguments  
> about the cost of writing an extra underscore and b) assertions that  
> this issue has already been decided.

Actually, that last reason isn't quite complete. It's not just that
it's decided, it's that it's already been decided, implemented, and
working the way it is for a year.

Your argument about consistency and rigor being compromised is
unqualified. I see nothing more or less consistent or rigorous about
the current implementation versus other proposals. The rule as it stands
is easy to follow and use and, as far as I know, has no inconsistencies.

-Damien

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 30/12/2008, at 11:00 AM, Noah Slater wrote:

> On Tue, Dec 30, 2008 at 10:08:04AM +1030, Antony Blakey wrote:
>> Any objection to this must be aesthetic.
>
> In the general case, and presuming it doesn't trump an obvious  
> technical goal

Which is the essence of this argument - IMO API consistency and rigor  
wrt design principles is being compromised by a) arguments about the  
cost of writing an extra underscore and b) assertions that this issue  
has already been decided.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

When I hear somebody sigh, 'Life is hard,' I am always tempted to ask,  
'Compared to what?'
   -- Sydney Harris

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Noah Slater <ns...@apache.org>.

On Tue, Dec 30, 2008 at 10:08:04AM +1030, Antony Blakey wrote:
> Any objection to this must be aesthetic.

In the general case, and presuming it doesn't trump an obvious technical goal, I
see no problem with pursuing æsthetics.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by paul jobs <we...@gmail.com>.

sorry about this - went to wrong group by mistake

On Mon, Dec 29, 2008 at 3:54 PM, paul jobs <we...@gmail.com> wrote:

> Preview: *Ben Kingsley* and Penelope Cruz in Elegy | BeyondHollywood *...*
> <http://www.beyondhollywood.com/preview-ben-kingsley-and-penelope-cruz-in-elegy/>   Jun
> 10, 2008 *...* *Charismatic* professor DAVID KEPESH (*Ben Kingsley*)
> glories in the pursuit of adventurous female students but never lets any
> woman get too *...*
>
> http://www.beyondhollywood.com/preview-ben-kingsley-and-penelope-cruz-in-elegy/
>  "Elegy": *Ben Kingsley* and Penélope Cruz stand out as lovers
> <http://www.iconocast.com/A0000000019/U4/News1.htm>   Aug 8, 2008 *...*"Elegy'' (R): Director Isabel Coixet explores the relationship between a
> *charismatic* college professor (*Ben Kingsley*) and a young woman *...*
> http://www.iconocast.com/A0000000019/U4/News1.htm
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by paul jobs <we...@gmail.com>.

Preview: *Ben Kingsley* and Penelope Cruz in Elegy | BeyondHollywood *...*
<http://www.beyondhollywood.com/preview-ben-kingsley-and-penelope-cruz-in-elegy/>
  Jun
10, 2008 *...* *Charismatic* professor DAVID KEPESH (*Ben Kingsley*) glories
in the pursuit of adventurous female students but never lets any woman get
too *...*
http://www.beyondhollywood.com/preview-ben-kingsley-and-penelope-cruz-in-elegy/
 "Elegy": *Ben Kingsley* and Penélope Cruz stand out as lovers
<http://www.iconocast.com/A0000000019/U4/News1.htm>   Aug 8, 2008
*...*"Elegy'' (R): Director Isabel Coixet explores the relationship
between a
*charismatic* college professor (*Ben Kingsley*) and a young woman *...*
http://www.iconocast.com/A0000000019/U4/News1.htm

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 30/12/2008, at 10:21 AM, Chris Anderson wrote:

> On Mon, Dec 29, 2008 at 3:38 PM, Antony Blakey <antony.blakey@gmail.com> wrote:
>
>> How is it 'easier to work with'? How does using _id and _rev  
>> everywhere make
>> it not easy? Surely it's not the difficulty of writing code?
>
> I argued in IM with Jan, that having _id and _rev only appear with
> prefixes in the context of a document, makes for a primitive type
> system. Any JSON object with _id and _rev fields is a CouchDB
> document. Where those same values appear elsewhere they are not
> prefixed, and it's not a document.
>
> I think it's nice that code which is designed to process documents
> will hit a speed bump if it is passed view rows. Duck typing is a
> fundamental advantage of JSON, and CouchDB's makes good use of it.
> It's nice to know by looking if you have a document on hand.

A similar argument could be applied to having _id and _rev everywhere,  
on the basis that it types the atom - wherever you see _id you know  
you have a document id, and similarly with _rev. And it makes it  
easier to write generic code that deals with ids and revs, because  
such code doesn't have to be aware of the context of either its input
or output.

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Isn't it enough to see that a garden is beautiful without having to  
believe that there are fairies at the bottom of it too?
   -- Douglas Adams

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Chris Anderson <jc...@gmail.com>.

On Mon, Dec 29, 2008 at 3:38 PM, Antony Blakey <an...@gmail.com> wrote:

> How is it 'easier to work with'? How does using _id and _rev everywhere make
> it not easy? Surely it's not the difficulty of writing code?

I argued in IM with Jan that having _id and _rev appear with
prefixes only in the context of a document makes for a primitive type
system. Any JSON object with _id and _rev fields is a CouchDB
document. Where those same values appear elsewhere they are not
prefixed, and it's not a document.

I think it's nice that code which is designed to process documents
will hit a speed bump if it is passed view rows. Duck typing is a
fundamental advantage of JSON, and CouchDB makes good use of it.
It's nice to know by looking if you have a document on hand.
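
The duck-typing check described above can be sketched roughly as follows. This is only an illustration: the helper name isCouchDocument and the sample values are made up, and the shape of the sample row (unprefixed id, with rev inside value) is an assumption rather than something stated in this message.

```javascript
// Hypothetical helper: treat any plain object that carries both `_id` and
// `_rev` at its root as a CouchDB document (duck typing by field name).
function isCouchDocument(obj) {
  return obj !== null &&
    typeof obj === "object" &&
    !Array.isArray(obj) &&
    typeof obj._id === "string" &&
    typeof obj._rev === "string";
}

// Sample values, invented for illustration only.
const doc = { _id: "post-1", _rev: "1-abc", title: "Hello" };
const row = { id: "post-1", key: "Hello", value: { rev: "1-abc" } };

console.log(isCouchDocument(doc)); // true
console.log(isCouchDocument(row)); // false
```

Code written for documents can call such a check first and refuse a view row, which is the "speed bump" mentioned above.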

-- 
Chris Anderson
http://jchris.mfdz.com

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Chris Anderson <jc...@gmail.com>.

On Mon, Dec 29, 2008 at 4:09 PM, Antony Blakey <an...@gmail.com> wrote:
>
> And as far as writing is concerned, using name identity for values that
> appear in multiple contexts makes it easier to write polymorphic code e.g. a
> function that extracts the document id and revision from either a view row
> or the document value (possibly) contained within.
>

Which is exactly the thing I like about only having _id and _rev in
documents. My being able to identify CouchDB documents by duck type is
something I've come to enjoy in my coding. I think it makes for
simpler applications in places. Splattering the world with meaningless
"_id"s and "_rev"s only makes things more complicated.

Polymorphic identity has its place, but you really shouldn't have an
_id / _rev pair on your hands, unless you've also got the document
itself. Breaking that contract will produce muddled thinking about
Couch's MVCC.

-- 
Chris Anderson
http://jchris.mfdz.com

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 30/12/2008, at 10:31 AM, Damien Katz wrote:

> The current rule is that every couch-domain value has a leading
> underscore when it appears in the root of a document. Your rule
> would mean that throughout the API, everything that has special
> significance should have an underscore. In views (key -> _key, value
> -> _value) in urls ( GET /db/doc?_rev="...") and design docs (views
> -> _views, map -> _map). That rule is simpler, but I think all the
> leading underscores are ugly (aesthetics) and require more
> unnecessary typing (efficiency) and gain you no additional behaviors
> (functionality), and all for the sake of consistency to a slightly
> simpler rule.

i.e. this simpler rule promotes consistency, which leads to more  
reliable understanding, over writing.

In all cases one should prefer readability/comprehensibility over  
writability. That is a principle worth following.

And as far as writing is concerned, using name identity for values  
that appear in multiple contexts makes it easier to write polymorphic  
code e.g. a function that extracts the document id and revision from  
either a view row or the document value (possibly) contained within.
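
A rough sketch of the kind of function meant here, under stated assumptions: the names identityOf and identityOfUniform are hypothetical, and the placement of rev in a row (at the top level or inside value) is assumed for illustration. The first version has to branch on context; the second shows what is left if the names were the same everywhere.

```javascript
// Hypothetical helper: return { id, rev } from either a document or a view
// row when the two contexts use different field names.
function identityOf(obj) {
  if (obj._id !== undefined) {
    // Looks like a document: underscore-prefixed names at the root.
    return { id: obj._id, rev: obj._rev };
  }
  // Looks like a view row: unprefixed names (rev location is an assumption).
  const rev = obj.rev !== undefined ? obj.rev : (obj.value && obj.value.rev);
  return { id: obj.id, rev: rev };
}

// With a single name in every context, no branching is needed.
function identityOfUniform(obj) {
  return { id: obj._id, rev: obj._rev };
}

console.log(identityOf({ _id: "post-1", _rev: "1-abc", title: "Hello" }));
console.log(identityOf({ id: "post-1", key: "Hello", value: { rev: "1-abc" } }));
```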

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

The trouble with the world is that the stupid are cocksure and the  
intelligent are full of doubt.
   -- Bertrand Russell

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Damien Katz <da...@apache.org>.

On Dec 29, 2008, at 6:38 PM, Antony Blakey wrote:

>
> On 30/12/2008, at 12:51 AM, Damien Katz wrote:
>
>> The current rule maybe not the most intuitive to a newbie, but it  
>> is far more consistent and easier to work with then when using the  
>> deeper APIs.
>
> How is it 'far more consistent'. What measure of consistency are you  
> using? Name identity is more cognitively fundamental than name  
> contextuality. The consistency guaranteed by the asserting that a  
> name is a name is a name is of a more fundamental form than the  
> application of a different rule.
>
> How is it 'easier to work with'? How does using _id and _rev  
> everywhere make it not easy? Surely it's not the difficulty of  
> writing code?
>
>> The only 2 other workable solutions I see is to either stuff  
>> everything special into a _meta structure or only use HTTP headers  
>> for all CouchDB meta information. But after having spent much time  
>> thinking about this issue, I think the current rule is the better  
>> compromise.
>
> A simple rule is that every couch-domain value has a leading  
> underscore. All technical issues disappear. Consistency ensues. Any  
> objection to this must be aesthetic.

The current rule is that every couch-domain value has a leading  
underscore when it appears in the root of a document. Your rule would  
mean that throughout the API, everything that has special significance  
should have an underscore: in views (key -> _key, value -> _value), in
URLs (GET /db/doc?_rev="...") and in design docs (views -> _views,
map -> _map). That rule is simpler, but I think all the leading
underscores are ugly (aesthetics), require more unnecessary typing
(efficiency) and gain you no additional behaviors (functionality),
all for the sake of consistency with a slightly simpler rule.
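
To make the current rule concrete, here is a minimal sketch that reads the underscore-prefixed fields from the root of a document and uses them in a URL, where the parameter is plain rev. The function name revisionPath, the database name and the sample values are made up for illustration.

```javascript
// Build the path for fetching one specific revision of a document.
function revisionPath(db, doc) {
  // At the root of a document, couch-domain values carry the underscore...
  const id = doc._id;
  const rev = doc._rev;
  // ...while the URL query parameter does not.
  return "/" + encodeURIComponent(db) + "/" + encodeURIComponent(id) +
    "?rev=" + encodeURIComponent(rev);
}

console.log(revisionPath("blog", { _id: "post-1", _rev: "1-abc" }));
// prints "/blog/post-1?rev=1-abc"
```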

-Damien

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 30/12/2008, at 12:51 AM, Damien Katz wrote:

> The current rule maybe not the most intuitive to a newbie, but it is  
> far more consistent and easier to work with then when using the  
> deeper APIs.

How is it 'far more consistent'? What measure of consistency are you
using? Name identity is more cognitively fundamental than name
contextuality. The consistency guaranteed by asserting that a name
is a name is a name is of a more fundamental form than the application
of a different rule.

How is it 'easier to work with'? How does using _id and _rev  
everywhere make it not easy? Surely it's not the difficulty of writing  
code?

> The only 2 other workable solutions I see is to either stuff  
> everything special into a _meta structure or only use HTTP headers  
> for all CouchDB meta information. But after having spent much time  
> thinking about this issue, I think the current rule is the better  
> compromise.

A simple rule is that every couch-domain value has a leading  
underscore. All technical issues disappear. Consistency ensues. Any  
objection to this must be aesthetic.

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Isn't it enough to see that a garden is beautiful without having to  
believe that there are fairies at the bottom of it too?
   -- Douglas Adams

Re: Decision making process (Re: Changing rev to _rev in view results (Was: Re: newbie question #1))

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

On Jan 2, 2009, at 7:37 AM, Chris Anderson wrote:

> On Fri, Jan 2, 2009 at 4:16 AM, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>>
>> On Jan 2, 2009, at 12:33 AM, Antony Blakey wrote:
>>>
>>> It's never been clear to me that there is a process for voting - the
>>> decision making process within the commit group seems opaque.
>>
>> How can that be?  I assume that all decisions are made in public on the
>> dev@ mailing list.
>>
>
> The committers have a history of deferring to Damien (especially on
> deeply technical matters like the document identity model). It's fair
> to say that most of what we do is bug fixes and the like. When we have
> a new feature or module under development, we like to run the code by
> Damien before we commit it. He understands CouchDB inside and out, and
> he's pretty good at seeing how an API detail or caching property will
> affect the big picture of how people use CouchDB.

That's perfectly reasonable if it's on the dev@ list.

>
>
> There have been votes on the dev list before, but they are rare
> because we so often move with consensus.

That's also perfectly reasonable, and in fact, how I personally like  
to participate - it seems that if it comes down to a vote, the  
consensus mode has failed.

geir

>
>
> -- 
> Chris Anderson
> http://jchris.mfdz.com

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Noah Slater <ns...@apache.org>.

On Fri, Jan 02, 2009 at 04:03:50PM +1030, Antony Blakey wrote:
> It's never been clear to me that there is a process for voting - the decision
> making process within the commit group seems opaque.

If a patch was submitted, and it was in question, we could have a standard ASF
vote to decide if we wanted to accept it or not.

> I think a change to the API could be decided without reference to the code
> implementing that change. In fact, IMO the API *should* be considered
> separately from the code implementing that change. Otherwise APIs will tend to
> be decided not on the basis of design, but on the amount of effort some person
> is prepared to spend to demonstrate it, and hence code inertia, often
> resulting in expedient solutions. This means that good, but expensive ideas,
> can be lost.

This implies that expedient solutions are necessarily bad. I actually think
there is a trade-off between conceptually good ideas, and practically good
ideas. One of the things that separates those is working code! That is to say,
it's easy to come up with ideas, but harder to execute them -- and I think both
of those factors are important when you're discussing the way forward.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "ara.t.howard" <ar...@gmail.com>.

On Jan 4, 2009, at 9:35 AM, Michael Fellinger wrote:

> For what it's worth, I'd love to see a separation of data and data
> about data, and I'd also like to propose a change to map functions:
>
> function(doc, meta){ emit(doc.title, [meta.id, meta.ref]); }
> function(doc){ emit(doc.title); }
>
> That way people who are complaining about having to type doc.doc.title
> can have their peace of mind as well.
>
> I might see the whole issue very limited, but it makes absolutely the
> most sense to me, everybody can stop worrying about which data is
> allowed in the data part and the metadata can grow in any direction.
> Nobody is annoyed by prefixing _ anymore (symetric API) and there is
> little technical need anymore for validating documents (as long as
> they can be serialized to valid JSON) before putting them into the db.
>
> ^ manveru



i think those of us who have maintained a lot of code ourselves see  
the wisdom in this.  having rules that are 100% consistent lifts a  
massive mental weight for anyone connected to the code.

keys.each do |key|
   unless key =~ %r/^_/
     ...


just gives me hives to think about.

a @ http://codeforpeople.com/
--
we can deny everything, except that we have the possibility of being  
better. simply reflect on that.
h.h. the 14th dalai lama

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

exactly :)

On Jan 4, 2009, at 11:35 AM, Michael Fellinger wrote:

> On Fri, Jan 2, 2009 at 9:30 PM, Robert Dionne <bo...@gmail.com> wrote:
>> When it comes to design I think there are always tradeoffs. This is where
>> intuition and experience count. In my opinion separating metadata from the
>> user's data is a more complex approach. It creates two parts to the
>> document, they have to be handled separately and it creates the need for two
>> kinds of API calls for the two types of data.  It seems like a good
>> approach, however it's very easy to look at an existing implementation and
>> see how things "ought" to be done.
>>
>> The current implementation has a nice simplicity to it that I would not
>> readily advocate changing. My first impression is that it reminded me of
>> Berkeley DB on steroids. The convention governing the use of the _id is not
>> that hard to deal with and it doesn't prevent one from handling JSON docs
>> that come from elsewhere. It seems that converting data from one database
>> system to another always involves some transformation.
>
> For what it's worth, I'd love to see a separation of data and data
> about data, and I'd also like to propose a change to map functions:
>
> function(doc, meta){ emit(doc.title, [meta.id, meta.ref]); }
> function(doc){ emit(doc.title); }
>
> That way people who are complaining about having to type doc.doc.title
> can have their peace of mind as well.
>
> I might see the whole issue very limited, but it makes absolutely the
> most sense to me, everybody can stop worrying about which data is
> allowed in the data part and the metadata can grow in any direction.
> Nobody is annoyed by prefixing _ anymore (symmetric API) and there is
> little technical need anymore for validating documents (as long as
> they can be serialized to valid JSON) before putting them into the db.
>
> ^ manveru
>
>> This discussion reminds me of Perlis' epigram(#15) that everything should be
>> built top down, except the first time.
>>
>>
>> On Jan 2, 2009, at 12:33 AM, Antony Blakey wrote:
>>
>>>
>>> On 02/01/2009, at 2:17 PM, Noah Slater wrote:
>>>
>>>> I appreciate you're frustrated with the current situation Antony, but I
>>>> think it's unfair for you to be claiming any kind of consensus without a
>>>> vote.
>>>
>>> That post wasn't meant to be a criticism. Apologies if it felt like it
>>> was.
>>>
>>> There isn't a clear consensus in this thread, which to my mind reflects
>>> the fact that there are trade-offs that don't have objective evaluation
>>> measures.
>>>
>>> I fully support the idea that a product should reflect the vision and
>>> opinion of a very small group. Abstracting from my preference for a more
>>> robustly theoretical approach to API design, the holistically best result is
>>> likely to arise from this model. So I don't e.g. mean 'gatekeeper' in a
>>> negative way.
>>>
>>>> I would be interested in seeing a patch, explanation, and vote. I've
>>>> already expressed my agreement with many of the points you've raised, and
>>>> I'm not the only one.
>>>
>>> I was only referring to a lack of expressed support for a fully reflexive
>>> model.
>>>
>>> It's never been clear to me that there is a process for voting - the
>>> decision making process within the commit group seems opaque.
>>>
>>>> It's pretty pointless for us to keep sending emails over proposed changes
>>>> to the code without actually seeing the changes.
>>>
>>> I think a change to the API could be decided without reference to the code
>>> implementing that change. In fact, IMO the API *should* be considered
>>> separately from the code implementing that change. Otherwise APIs will tend
>>> to be decided not on the basis of design, but on the amount of effort some
>>> person is prepared to spend to demonstrate it, and hence code inertia, often
>>> resulting in expedient solutions. This means that good, but expensive ideas,
>>> can be lost.
>>>
>>> The models under discussion have evolved from simple name identity by
>>> using '_id' and '_rev' everywhere, to a '_meta' wrapper, to Geir's fully
>>> reflexive model.
>>>
>>> So I'd prefer to get buy-in to a model or principles, at which point
>>> anyone could implement it. That's what I tried to do with the change to the
>>> FS layout to support i18n, the committable implementation of which is my
>>> focus right now.

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Michael Fellinger <m....@gmail.com>.

On Fri, Jan 2, 2009 at 9:30 PM, Robert Dionne <bo...@gmail.com> wrote:
> When it comes to design I think there are always tradeoffs. This is where
> intuition and experience count. In my opinion separating metadata from the
> user's data is a more complex approach. It creates two parts to the
> document, they have to be handled separately and it creates the need for two
> kinds of API calls for the two types of data.  It seems like a good
> approach, however it's very easy to look at an existing implementation and
> see how things "ought" to be done.
>
> The current implementation has a nice simplicity to it that I would not
> readily advocate changing. My first impression is that it reminded me of
> Berkeley DB on steroids. The convention governing the use of the _id is not
> that hard to deal with and it doesn't prevent one from handling JSON docs
> that come from elsewhere. It seems that converting data from one database
> system to another always involves some transformation.

For what it's worth, I'd love to see a separation of data and data
about data, and I'd also like to propose a change to map functions:

function(doc, meta){ emit(doc.title, [meta.id, meta.ref]); }
function(doc){ emit(doc.title); }

That way people who are complaining about having to type doc.doc.title
can have their peace of mind as well.

I might be seeing the whole issue too narrowly, but it makes the most
sense to me: everybody can stop worrying about which data is
allowed in the data part, and the metadata can grow in any direction.
Nobody is annoyed by prefixing _ anymore (symmetric API), and there is
little technical need anymore for validating documents (as long as
they can be serialized to valid JSON) before putting them into the db.
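
As a rough sketch of how the proposed two-argument map functions might be fed, the snippet below splits a stored document into a data part and a meta part before calling the map function. Everything here is hypothetical: splitDocument, runMap and the local emit are illustration-only names, the split simply treats underscore-prefixed root fields as metadata, and meta.rev is used on the assumption that it is what the proposal intends for the revision.

```javascript
// Stand-in for the view server's emit(): collect rows in an array.
const emitted = [];
function emit(key, value) {
  emitted.push([key, value]);
}

// Hypothetical split: underscore-prefixed root fields become metadata
// ("_id" -> meta.id, "_rev" -> meta.rev); everything else is user data.
function splitDocument(storedDoc) {
  const data = {};
  const meta = {};
  for (const [name, value] of Object.entries(storedDoc)) {
    if (name.startsWith("_")) {
      meta[name.slice(1)] = value;
    } else {
      data[name] = value;
    }
  }
  return { data, meta };
}

// Call a map function with (doc, meta); one-argument functions ignore meta.
function runMap(mapFn, storedDoc) {
  const { data, meta } = splitDocument(storedDoc);
  mapFn(data, meta);
}

const stored = { _id: "post-1", _rev: "1-abc", title: "Hello" };
runMap(function (doc, meta) { emit(doc.title, [meta.id, meta.rev]); }, stored);
runMap(function (doc) { emit(doc.title); }, stored);
console.log(emitted);
```

The map functions themselves never see an underscore-prefixed name, which is the symmetry the proposal is after.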

^ manveru

> This discussion reminds me of Perlis' epigram(#15) that everything should be
> built top down, except the first time.
>
>
> On Jan 2, 2009, at 12:33 AM, Antony Blakey wrote:
>
>>
>> On 02/01/2009, at 2:17 PM, Noah Slater wrote:
>>
>>> I appreciate you're frustrated with the current situation Antony, but I
>>> think it's unfair for you to be claiming any kind of consensus without a
>>> vote.
>>
>> That post wasn't meant to be a criticism. Apologies if it felt like it
>> was.
>>
>> There isn't a clear consensus in this thread, which to my mind reflects
>> the fact that there are trade-offs that don't have objective evaluation
>> measures.
>>
>> I fully support the idea that a product should reflect the vision and
>> opinion of a very small group. Abstracting from my preference for a more
>> robustly theoretical approach to API design, the holistically best result is
>> likely to arise from this model. So I don't e.g. mean 'gatekeeper' in a
>> negative way.
>>
>>> I would be interested in seeing a patch, explanation, and vote. I've
>>> already expressed my agreement with many of the points you've raised, and
>>> I'm not the only one.
>>
>> I was only referring to a lack of expressed support for a fully reflexive
>> model.
>>
>> It's never been clear to me that there is a process for voting - the
>> decision making process within the commit group seems opaque.
>>
>>> It's pretty pointless for us to keep sending emails over proposed changes
>>> to the code without actually seeing the changes.
>>
>> I think a change to the API could be decided without reference to the code
>> implementing that change. In fact, IMO the API *should* be considered
>> separately from the code implementing that change. Otherwise APIs will tend
>> to be decided not on the basis of design, but on the amount of effort some
>> person is prepared to spend to demonstrate it, and hence code inertia, often
>> resulting in expedient solutions. This means that good, but expensive ideas,
>> can be lost.
>>
>> The models under discussion have evolved from simple name identity by
>> using '_id' and '_rev' everywhere, to a '_meta' wrapper, to Geir's fully
>> reflexive model.
>>
>> So I'd prefer to get buy-in to a model or principles, at which point
>> anyone could implement it. That's what I tried to do with the change to the
>> FS layout to support i18n, the committable implementation of which is my
>> focus right now.

Re: Decision making process (Re: Changing rev to _rev in view results (Was: Re: newbie question #1))

Posted by Noah Slater <ns...@apache.org>.

On Sat, Jan 03, 2009 at 12:27:20AM +1030, Antony Blakey wrote:
> Is this voting/decision-making process visible?

Yes, the formal decision making process is visible. However, as a group of
people who work closely together we communicate in lots of other ways, be that
public IRC, private IRC, private email, Twitter, or telephone calls. It is
important that we discuss everything official on the mailing list as per The
Apache Way, but it is a natural consequence of our working together that we
become informally aware of each other's position on various topics. Short of
refusing to discuss CouchDB "in real life," I see this as unavoidable.

> If I'm trying to effect a change, how can I judge what would get it over the
> line i.e. who objects and for what reasons?

I guess the best way would be to do a proper proposal and request a vote. This
vote wouldn't be anything official, but it would be an informal way of
soliciting definite feedback from the community and the committers.

> How do I ensure that my patch isn't stuck in development hell while the PMC
> (maybe) waits for consensus to (maybe never) emerge?

By submitting a patch to the mailing list and requesting a vote.

> If I have a change in mind, I would prefer where possible to get at
> least a majority of the PMC on side before doing (possibly a lot of)
> work.

Makes sense. I guess that's where a proper proposal would come in.

> It looks like it is only when a patch is up for approval that any real
> decision will be made i.e. I can't prompt a decision on the metadata thread
> without doing all the work, which may be pointless. That makes a proposal very
> expensive, especially if the PMC doesn't operate in a predictable fashion.

Predictable fashion?

> That in turn makes it less likely that people will contribute, and seems to be
> a strategic weakness.

I'm not sure I follow your reasoning at all.

The way I see it is that if someone suggests something and we all love it, there
is a clear consensus and we would go forward with that. The reverse is also
true. In this particular case it is clear to me that we do not have consensus. I
think part of that may be due to there being no concrete proposal, everyone
seems to be suggesting different things!

We are only at this juncture because nobody can agree on the best way
forwards. All I am saying is that maybe a proper proposal would be a nice idea,
something we can all say "+1" on, or not as the case may be. If you asked me now
what my thoughts were on the whole topic I would tell you that I don't know,
because I'm not even sure what the agreed proposal is, let alone what the
agreed consensus is.

> IMO an improvement to this process would be a mechanism to submit a proposal
> to the *PMC*, to get some definitive feedback, the contract being that I'll do
> the proposal and the implementation in return for the PMC being prepared to
> give a provisional indication that a) such work would not be in vain; or b)
> the proposal won't be pursued; or c) more work is needed, but it's not out of
> the question; and/or d) some comment about what would need to be addressed to
> move forward. All of which is to say: is the PMC reified in some way that
> doesn't involve canvassing each member individually?

This process is already in place! Heh. All anyone has to do is write up a
concrete proposal of the changes and submit it to the mailing list requesting a
vote. We can move forward from there. If there are comments about the proposal,
the sponsor could collect them and draft a new proposal. Rinse and repeat.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Decision making process (Re: Changing rev to _rev in view results (Was: Re: newbie question #1))

Posted by Antony Blakey <an...@gmail.com>.

On 03/01/2009, at 12:03 AM, Noah Slater wrote:

> It's worth noting that we may never make a decision, or the decision  
> might take
> a long time. Looking back at the newline proposal, we never decided  
> consensus
> was to reject it outright, but we never resolved it the other way  
> either. One of
> us could move to call a vote, but we all feel it's better to wait for
> consensus.

Is this voting/decision-making process visible? If I'm trying to  
effect a change, how can I judge what would get it over the line i.e.  
who objects and for what reasons? How do I ensure that my patch isn't  
stuck in development hell while the PMC (maybe) waits for consensus to  
(maybe never) emerge?

If I have a change in mind, I would prefer where possible to get at  
least a majority of the PMC on side before doing (possibly a lot of)  
work. It looks like it is only when a patch is up for approval that  
any real decision will be made i.e. I can't prompt a decision on the  
metadata thread without doing all the work, which may be pointless.  
That makes a proposal very expensive, especially if the PMC doesn't  
operate in a predictable fashion. That in turn makes it less likely  
that people will contribute, and seems to be a strategic weakness.

IMO an improvement to this process would be a mechanism to submit a  
proposal to the *PMC*, to get some definitive feedback, the contract  
being that I'll do the proposal and the implementation in return for  
the PMC being prepared to give a provisional indication that a) such  
work would not be in vain; or b) the proposal won't be pursued; or c)  
more work is needed, but it's not out of the question; and/or d) some  
comment about what would need to be addressed to move forward. All of  
which is to say: is the PMC reified in some way that doesn't involve  
canvassing each member individually?

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

You can't just ask customers what they want and then try to give that  
to them. By the time you get it built, they'll want something new.
   -- Steve Jobs

Re: Decision making process (Re: Changing rev to _rev in view results (Was: Re: newbie question #1))

Posted by Noah Slater <ns...@apache.org>.

On Fri, Jan 02, 2009 at 11:19:18PM +1030, Antony Blakey wrote:
> That would confirm my presumption that it's probably a waste of time pushing
> something that Damien's firmly against, as opposed to expecting to vote on it.

I'm not too sure this is a wise approach. Chris is correct in that we've opted
for following consensus in the past, but that doesn't mean that whatever Damien
says we all agree with. There's still the issue of the JSON newline which has
the committers split right down the middle!

I see that this issue can take one of three possible routes:

  * the PMC decide that consensus is to reject the proposal

  * the PMC decide that opinion is split and hold an ASF style vote

  * the PMC decide that consensus is to accept the proposal

The PMC comprises me, Chris, Chris, Jan, and Damien.

It's worth noting that we may never make a decision, or the decision might take
a long time. Looking back at the newline proposal, we never decided consensus
was to reject it outright, but we never resolved it the other way either. One of
us could move to call a vote, but we all feel it's better to wait for consensus.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Decision making process (Re: Changing rev to _rev in view results (Was: Re: newbie question #1))

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

On Jan 2, 2009, at 7:49 AM, Antony Blakey wrote:

>
> On 02/01/2009, at 11:07 PM, Chris Anderson wrote:
>
>> On Fri, Jan 2, 2009 at 4:16 AM, Geir Magnusson Jr. <ge...@pobox.com>  
>> wrote:
>>>
>>> On Jan 2, 2009, at 12:33 AM, Antony Blakey wrote:
>>>>
>>>> It's never been clear to me that there is a process for voting -  
>>>> the
>>>> decision making process within the commit group seems opaque.
>>>
>>> How can that be?  I assume that all decisions are made in public  
>>> on the dev@
>>> mailing list.
>>>
>>
>> The committers have a history of deferring to Damien (especially on
>> deeply technical matters like the document identity model). It's fair
>> to say that most of what we do is bug fixes and the like. When we  
>> have
>> a new feature or module under development, we like to run the code by
>> Damien before we commit it. He understands CouchDB inside and out,  
>> and
>> he's pretty good at seeing how an API detail or caching property will
>> affect the big picture of how people use CouchDB.
>
> That would confirm my presumption that it's probably a waste of time  
> pushing something that Damien's firmly against, as opposed to  
> expecting to vote on it.

That would be the case of any committer - anyone can veto a code  
change, including one by Damien  (Technically it's a PMC member, but  
some communities include all committers in this, which IMO is a good  
way)

However, the fact that people defer to Damien is perfectly  
reasonable.  (But remember the goal of an ASF community - you want to  
make it strong enough that the original founders/motivators can step  
back)

>
>
>> There have been votes on the dev list before, but they are rare
>> because we so often move with consensus.
>
> I'm merely curious about this, but now that Couch is formally an
> Apache project, is there some Apache-mandated consensus-driven
> decision making approach, or do they accept whatever model comes in  
> - I'm wondering if the ASF brand might 'mean' something in that  
> sense. And I've just noticed that Geir is on the ASF board - he  
> should know if anyone does!

I'm not here as anything but an interested user that knows something  
about the ASF, and I see in a followup that you found something on the  
website :)  All is well.

geir

Re: Decision making process (Re: Changing rev to _rev in view results (Was: Re: newbie question #1))

Posted by Antony Blakey <an...@gmail.com>.

On 02/01/2009, at 11:19 PM, Antony Blakey wrote:

> I'm merely curious about this, but now that Couch is formally an
> Apache project, is there some Apache-mandated consensus-driven
> decision making approach, or do they accept whatever model comes in  
> - I'm wondering if the ASF brand might 'mean' something in that  
> sense. And I've just noticed that Geir is on the ASF board - he  
> should know if anyone does!

Ahh yes, I should have read the Apache site :/ I'll slap my own wrist.

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Reflecting on W.H. Auden's contemplation of 'necessary murders' in the  
Spanish Civil War, George Orwell wrote that such amorality was only  
really possible, 'if you are the kind of person who is always  
somewhere else when the trigger is pulled'.
   -- John Birmingham, "Appeasing Jakarta"

Re: Decision making process (Re: Changing rev to _rev in view results (Was: Re: newbie question #1))

Posted by Antony Blakey <an...@gmail.com>.

On 02/01/2009, at 11:07 PM, Chris Anderson wrote:

> On Fri, Jan 2, 2009 at 4:16 AM, Geir Magnusson Jr. <ge...@pobox.com>  
> wrote:
>>
>> On Jan 2, 2009, at 12:33 AM, Antony Blakey wrote:
>>>
>>> It's never been clear to me that there is a process for voting - the
>>> decision making process within the commit group seems opaque.
>>
>> How can that be?  I assume that all decisions are made in public on  
>> the dev@
>> mailing list.
>>
>
> The committers have a history of deferring to Damien (especially on
> deeply technical matters like the document identity model). It's fair
> to say that most of what we do is bug fixes and the like. When we have
> a new feature or module under development, we like to run the code by
> Damien before we commit it. He understands CouchDB inside and out, and
> he's pretty good at seeing how an API detail or caching property will
> affect the big picture of how people use CouchDB.

That would confirm my presumption that it's probably a waste of time  
pushing something that Damien's firmly against, as opposed to  
expecting to vote on it.

> There have been votes on the dev list before, but they are rare
> because we so often move with consensus.

I'm merely curious about this, but now that Couch is formally an Apache
project, is there some Apache-mandated consensus-driven decision making
approach, or do they accept whatever model comes in - I'm wondering if  
the ASF brand might 'mean' something in that sense. And I've just  
noticed that Geir is on the ASF board - he should know if anyone does!

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

The intuitive mind is a sacred gift and the rational mind is a  
faithful servant. We have created a society that honours the servant  
and has forgotten the gift.
   -- Albert Einstein

Re: Decision making process (Re: Changing rev to _rev in view results (Was: Re: newbie question #1))

Posted by Chris Anderson <jc...@gmail.com>.

On Fri, Jan 2, 2009 at 4:16 AM, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>
> On Jan 2, 2009, at 12:33 AM, Antony Blakey wrote:
>>
>> It's never been clear to me that there is a process for voting - the
>> decision making process within the commit group seems opaque.
>
> How can that be?  I assume that all decisions are made in public on the dev@
> mailing list.
>

The committers have a history of deferring to Damien (especially on
deeply technical matters like the document identity model). It's fair
to say that most of what we do is bug fixes and the like. When we have
a new feature or module under development, we like to run the code by
Damien before we commit it. He understands CouchDB inside and out, and
he's pretty good at seeing how an API detail or caching property will
affect the big picture of how people use CouchDB.

There have been votes on the dev list before, but they are rare
because we so often move with consensus.

-- 
Chris Anderson
http://jchris.mfdz.com

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 02/01/2009, at 10:51 PM, Geir Magnusson Jr. wrote:

> But I am a little worried that if the "project lead" and  
> "gatekeepers" are even *thought* by active members of the community  
> to be dead-set against it that I would be wasting time.

To be blunt, and this is my impression only: I thought Damien was  
strongly against changing it, and got the impression that there was  
some significant degree of deferral to his opinion, understandably. It  
is difficult to get a handle on the social and procedural dynamic in  
this situation.

I really don't want any political/social meta-issue to get out of
hand, sorry if I led it in that direction.

> I think it's an important issue (clearly or I wouldn't be spending  
> so much vacation time discussing it), but it's not a make-or-break  
> issue for me.

Ditto.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Nothing is really work unless you would rather be doing something else.
   -- J. M. Barrie

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

I actually don't mind digging in and spending the time showing what I  
mean with code (although the barrier here is much higher since I don't  
know erlang at all...)

But I am a little worried that if the "project lead" and "gatekeepers"  
are even *thought* by active members of the community to be dead-set  
against it that I would be wasting time.

I think it's an important issue (clearly or I wouldn't be spending so  
much vacation time discussing it), but it's not a make-or-break issue  
for me.

geir


On Jan 2, 2009, at 7:17 AM, Noah Slater wrote:

> Not really, my point stands. Show the code, then we vote. :)
>
> On the other hand, consensus building can't harm either.
>
> On Fri, Jan 02, 2009 at 07:14:26AM -0500, Geir Magnusson Jr. wrote:
>> Maybe the tradition should be of an Apache project? :)
>>
>> From my limited interaction w/ the Linux kernel community, it's a  
>> very
>> different beastie...
>>
>> geir
>>
>> On Jan 1, 2009, at 10:47 PM, Noah Slater wrote:
>>
>>> On Fri, Jan 02, 2009 at 09:15:21AM +1030, Antony Blakey wrote:
>>>> No. The primary reason is "why change - the current mechanism has
>>>> worked for a
>>>> year". Damien (project lead) doesn't regard change as necessary,  
>>>> and
>>>> a
>>>> significant change to support top-level reflexivity (which is your
>>>> primary
>>>> thrust) doesn't have support from the other gatekeepers. There is
>>>> some support
>>>> for name identity, although I suspect not enough to prompt a  
>>>> change.
>>>
>>> I appreciate you're frustrated with the current situation Antony,  
>>> but I
>>> think
>>> it's unfair for you to be claiming any kind of consensus without a
>>> vote. I would
>>> be interested in seeing a patch, explanation, and vote. I've already
>>> expressed
>>> my agreement with many of the points you've raised, and I'm not the
>>> only one.
>>>
>>> It's pretty pointless for us to keep sending emails over proposed
>>> changes to the
>>> code without actually seeing the changes. So, in the tradition of  
>>> the
>>> Linux
>>> kernel, show the code and let's have a vote!
>>>
>>
>
> -- 
> Noah Slater, http://tumbolia.org/nslater

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Noah Slater <ns...@apache.org>.

Not really, my point stands. Show the code, then we vote. :)

On the other hand, consensus building can't harm either.

On Fri, Jan 02, 2009 at 07:14:26AM -0500, Geir Magnusson Jr. wrote:
> Maybe the tradition should be of an Apache project? :)
>
> From my limited interaction w/ the Linux kernel community, it's a very
> different beastie...
>
> geir
>
> On Jan 1, 2009, at 10:47 PM, Noah Slater wrote:
>
>> On Fri, Jan 02, 2009 at 09:15:21AM +1030, Antony Blakey wrote:
>>> No. The primary reason is "why change - the current mechanism has
>>> worked for a
>>> year". Damien (project lead) doesn't regard change as necessary, and
>>> a
>>> significant change to support top-level reflexivity (which is your
>>> primary
>>> thrust) doesn't have support from the other gatekeepers. There is
>>> some support
>>> for name identity, although I suspect not enough to prompt a change.
>>
>> I appreciate you're frustrated with the current situation Antony, but I
>> think
>> it's unfair for you to be claiming any kind of consensus without a
>> vote. I would
>> be interested in seeing a patch, explanation, and vote. I've already
>> expressed
>> my agreement with many of the points you've raised, and I'm not the
>> only one.
>>
>> It's pretty pointless for us to keep sending emails over proposed
>> changes to the
>> code without actually seeing the changes. So, in the tradition of the
>> Linux
>> kernel, show the code and let's have a vote!
>>
>

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 31/12/2008, at 11:07 PM, Robert Dionne wrote:

> I understand the issue. I noted the use of _id versus id myself and  
> wasn't that put off by it, just seemed a quirk of the implementation.

Implementation quirks shouldn't be materialized in the API. They're a
red flag that says "this wasn't designed", and IMO are always regretted.

But these aren't implementation quirks - Damien did think about this  
issue, I just think he weighted the competing issues wrongly.

> I realize you've likely written a lot of code at this point and have
> run into reuse issues. It's not unusual to have different names for  
> the same thing if the context is different.

It's not unusual for APIs to be not as good as they could be,  
sometimes because of lack of thought, sometimes lack of knowledge.

I guess one either regards the fundamental nature of name identity to  
be important, or not. No value judgement intended.

>> 5. Use a '_meta' wrapper for the metadata. I don't see any  
>> technical cons, and IMO it is by far the cleanest model. Name identity
>> is preserved, it's arbitrarily extensible without scalability  
>> concerns, and is structural rather than lexical.
>
> It is clearly cleaner and has its advantages, however I have to
> agree with an earlier poster; "Putting them in a _meta group might  
> encourage aggregation and manipulation of the bookkeeping metadata  
> separately from the document, which to me sounds like a recipe for  
> trouble."

It might encourage, or might not. Orthogonally, however, what's wrong  
with treating and manipulating metadata separately from the document?  
To give a concrete example, I manipulate _id and _rev separately from  
the 'rest' of the document now, because _id + _rev uniquely identify  
an immutable value, and that value (the document) can be cached, keyed  
on _id + _rev, with no concern about cache invalidation. It's  
fantastic. Chris Anderson was surprised that I would want _id and _rev  
in a view that doesn't supply the document, and then allow my users to  
write to the document on that basis. That's an example of where  
someone (with much more Couch expertise than me) was guided by their  
own context and uses, to not see a particular way of using the system.  
That's no disrespect to Chris, but rather an indication that the  
application of general principles is a better way to do API design
than the extrapolation of one's own experience or use cases. My goal  
is a design methodology driven as much as possible by an algebra of  
design principles. This has been done before, e.g. in programming
language design, of which API design is a form. Sorry, pet topic of  
mine :/
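
The caching argument above can be sketched in a few lines. This is a hedged illustration, not code from the thread or from any CouchDB client library: `RevisionCache`, `keyFor`, the `fetchDoc` loader, and the sample revision strings are hypothetical. The point it shows is that because an `_id` + `_rev` pair names an immutable value, a cache keyed on that pair never needs invalidation; a new revision is just a new key.

```javascript
// Hypothetical sketch: cache documents keyed on (_id, _rev).
// A given (_id, _rev) pair always names the same immutable document value,
// so entries never have to be invalidated.

class RevisionCache {
  constructor(fetchDoc) {
    this.fetchDoc = fetchDoc; // assumed caller-supplied loader: (id, rev) => document
    this.entries = new Map();
    this.misses = 0;          // counts how often the loader was actually called
  }

  // Build an unambiguous composite key from id and rev.
  static keyFor(id, rev) {
    return JSON.stringify([id, rev]);
  }

  get(id, rev) {
    const key = RevisionCache.keyFor(id, rev);
    if (!this.entries.has(key)) {
      this.misses += 1;
      this.entries.set(key, this.fetchDoc(id, rev));
    }
    return this.entries.get(key);
  }
}

// In-memory stand-in for the database, holding two revisions of one document.
const revisions = new Map([
  [RevisionCache.keyFor("a1", "rev-1"), { _id: "a1", _rev: "rev-1", title: "Hello" }],
  [RevisionCache.keyFor("a1", "rev-2"), { _id: "a1", _rev: "rev-2", title: "Hello again" }],
]);

const cache = new RevisionCache((id, rev) => revisions.get(RevisionCache.keyFor(id, rev)));

const first = cache.get("a1", "rev-1");
const again = cache.get("a1", "rev-1"); // served from the cache, no second fetch
const newer = cache.get("a1", "rev-2"); // a new revision is simply a new key

console.log(first === again, newer.title, cache.misses);
```

This is also why having both `_id` and `_rev` in a view row is useful even when the row does not carry the document: the pair is enough to look the document up in such a cache.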

IMO there are good reasons to allow the document and its metadata to
be separated cleanly, and even if I *couldn't* think of any use cases,  
the idea that one should discourage it without having rock solid  
technical reasons for saying that it *must* not be done, I think is  
wrong.

> This would be a more complex design than the current use of the  
> underscore at the top level of documents and would definitely  
> encourage a quite different implementation. I don't know the  
> internals enough yet to comment on this. The code there to date is  
> remarkably terse for what it does but this may just reflect the use  
> of Erlang.

Even if it's a bit more work for the implementation, I think the  
cleanliness that results is worth it. An implementation is built once,  
and used many times, and it's the uses/users of the system that should  
have priority.

BTW: That's not a criticism of the gatekeepers - I know they agree  
with this principle, because they've agreed to accept a more  
complicated file system layout implementation in order that database  
and design document/view names can be arbitrary.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

The fact that an opinion has been widely held is no evidence whatever  
that it is not utterly absurd.
   -- Bertrand Russell

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

Maybe the tradition should be of an Apache project? :)

From my limited interaction w/ the Linux kernel community, it's a
very different beastie...

geir

On Jan 1, 2009, at 10:47 PM, Noah Slater wrote:

> On Fri, Jan 02, 2009 at 09:15:21AM +1030, Antony Blakey wrote:
>> No. The primary reason is "why change - the current mechanism has  
>> worked for a
>> year". Damien (project lead) doesn't regard change as necessary,  
>> and a
>> significant change to support top-level reflexivity (which is your  
>> primary
>> thrust) doesn't have support from the other gatekeepers. There is  
>> some support
>> for name identity, although I suspect not enough to prompt a change.
>
> I appreciate you're frustrated with the current situation Antony,  
> but I think
> it's unfair for you to be claiming any kind of consensus without a  
> vote. I would
> be interested in seeing a patch, explanation, and vote. I've already  
> expressed
> my agreement with many of the points you've raised, and I'm not the  
> only one.
>
> It's pretty pointless for us to keep sending emails over proposed  
> changes to the
> code without actually seeing the changes. So, in the tradition of  
> the Linux
> kernel, show the code and let's have a vote!
>
> -- 
> Noah Slater, http://tumbolia.org/nslater

Decision making process (Re: Changing rev to _rev in view results (Was: Re: newbie question #1))

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

On Jan 2, 2009, at 12:33 AM, Antony Blakey wrote:
>
> It's never been clear to me that there is a process for voting - the  
> decision making process within the commit group seems opaque.

How can that be?  I assume that all decisions are made in public on  
the dev@ mailing list.

geir

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Robert Dionne <bo...@gmail.com>.

When it comes to design I think there are always tradeoffs. This is  
where intuition and experience count. In my opinion separating  
metadata from the user's data is a more complex approach. It creates  
two parts to the document, they have to be handled separately and it  
creates the need for two kinds of API calls for the two types of  
data.  It seems like a good approach, however it's very easy to look  
at an existing implementation and see how things "ought" to be done.

The current implementation has a nice simplicity to it that I would  
not readily advocate changing. My first impression is that it  
reminded me of Berkeley DB on steroids. The convention governing the  
use of the _id is not that hard to deal with and it doesn't prevent  
one from handling JSON docs that come from elsewhere. It seems that  
converting data from one database system to another always involves  
some transformation.

This discussion reminds me of Perlis' epigram(#15) that everything  
should be built top down, except the first time.


On Jan 2, 2009, at 12:33 AM, Antony Blakey wrote:

>
> On 02/01/2009, at 2:17 PM, Noah Slater wrote:
>
>> I appreciate you're frustrated with the current situation Antony,  
>> but I think
>> it's unfair for you to be claiming any kind of consensus without a  
>> vote.
>
> That post wasn't meant to be a criticism. Apologies if it felt like  
> it was.
>
> There isn't a clear consensus in this thread, which to my mind  
> reflects the fact that there are trade-offs that don't have  
> objective evaluation measures.
>
> I fully support the idea that a product should reflect the vision  
> and opinion of a very small group. Abstracting from my preference  
> for a more robustly theoretical approach to API design, the
> holistically best result is likely to arise from this model. So I  
> don't e.g. mean 'gatekeeper' in a negative way.
>
>> I would
>> be interested in seeing a patch, explanation, and vote. I've  
>> already expressed
>> my agreement with many of the points you've raised, and I'm not  
>> the only one.
>
> I was only referring to a lack of expressed support for a fully  
> reflexive model.
>
> It's never been clear to me that there is a process for voting -  
> the decision making process within the commit group seems opaque.
>
>> It's pretty pointless for us to keep sending emails over proposed  
>> changes to the
>> code without actually seeing the changes.
>
> I think a change to the API could be decided without reference to  
> the code implementing that change. In fact, IMO the API *should* be  
> considered separately from the code implementing that change.  
> Otherwise APIs will tend to be decided not on the basis of design,  
> but on the amount of effort some person is prepared to spend to  
> demonstrate it, and hence code inertia, often resulting in  
> expedient solutions. This means that good, but expensive ideas, can  
> be lost.
>
> The models under discussion have evolved from simple name identity  
> by using '_id' and '_rev' everywhere, to a '_meta' wrapper, to  
> Geir's fully reflexive model.
>
> So I'd prefer to get buy-in to a model or principles, at which  
> point anyone could implement it. That's what I tried to do with the  
> change to the FS layout to support i18n, the committable  
> implementation of which is my focus right now.
>
> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> The intuitive mind is a sacred gift and the rational mind is a  
> faithful servant. We have created a society that honours the  
> servant and has forgotten the gift.
>   -- Albert Einstein
>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 02/01/2009, at 2:17 PM, Noah Slater wrote:

> I appreciate you're frustrated with the current situation Antony,  
> but I think
> it's unfair for you to be claiming any kind of consensus without a  
> vote.

That post wasn't meant to be a criticism. Apologies if it felt like it  
was.

There isn't a clear consensus in this thread, which to my mind  
reflects the fact that there are trade-offs that don't have objective  
evaluation measures.

I fully support the idea that a product should reflect the vision and  
opinion of a very small group. Abstracting from my preference for a  
more robustly theoretical approach to API design, the holistically best
result is likely to arise from this model. So I don't e.g. mean  
'gatekeeper' in a negative way.

> I would
> be interested in seeing a patch, explanation, and vote. I've already  
> expressed
> my agreement with many of the points you've raised, and I'm not the  
> only one.

I was only referring to a lack of expressed support for a fully  
reflexive model.

It's never been clear to me that there is a process for voting - the  
decision making process within the commit group seems opaque.

> It's pretty pointless for us to keep sending emails over proposed  
> changes to the
> code without actually seeing the changes.

I think a change to the API could be decided without reference to the  
code implementing that change. In fact, IMO the API *should* be  
considered separately from the code implementing that change.  
Otherwise APIs will tend to be decided not on the basis of design, but  
on the amount of effort some person is prepared to spend to  
demonstrate it, and hence code inertia, often resulting in expedient  
solutions. This means that good, but expensive ideas, can be lost.

The models under discussion have evolved from simple name identity by  
using '_id' and '_rev' everywhere, to a '_meta' wrapper, to Geir's  
fully reflexive model.

So I'd prefer to get buy-in to a model or principles, at which point  
anyone could implement it. That's what I tried to do with the change  
to the FS layout to support i18n, the committable implementation of  
which is my focus right now.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

The intuitive mind is a sacred gift and the rational mind is a  
faithful servant. We have created a society that honours the servant  
and has forgotten the gift.
   -- Albert Einstein

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Noah Slater <ns...@apache.org>.

On Fri, Jan 02, 2009 at 09:15:21AM +1030, Antony Blakey wrote:
> No. The primary reason is "why change - the current mechanism has worked for a
> year". Damien (project lead) doesn't regard change as necessary, and a
> significant change to support top-level reflexivity (which is your primary
> thrust) doesn't have support from the other gatekeepers. There is some support
> for name identity, although I suspect not enough to prompt a change.

I appreciate you're frustrated with the current situation Antony, but I think
it's unfair for you to be claiming any kind of consensus without a vote. I would
be interested in seeing a patch, explanation, and vote. I've already expressed
my agreement with many of the points you've raised, and I'm not the only one.

It's pretty pointless for us to keep sending emails over proposed changes to the
code without actually seeing the changes. So, in the tradition of the Linux
kernel, show the code and let's have a vote!

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Jan Lehnardt <ja...@apache.org>.

On 2 Jan 2009, at 02:24, Geir Magnusson Jr. wrote:

> BTW, for maximum utility,  I think that the view API will have to  
> change as well.  There's incredible power in the CDB view model, but  
> you'll want to be able to return a pure "user document" from a call  
> to a view (conform to some specific "schema"), rather than at least  
> what I understand is the current metadata-oriented structure.

Can you expand on that? With examples and all?

Cheers
Jan
--


>
> On Jan 1, 2009, at 7:53 PM, Damien Katz wrote:
>
>> Why can't you just always stick the desired document into an body  
>> field on the document? If you always do that, then you can round  
>> trip without problem.
>>
>> -Damien
>>
>>
>> On Jan 1, 2009, at 7:17 PM, Geir Magnusson Jr. wrote:
>>
>>>
>>> On Jan 1, 2009, at 7:14 PM, Adam Kocoloski wrote:
>>>
>>>> On Jan 1, 2009, at 4:45 PM, Geir Magnusson Jr. wrote:
>>>>
>>>>> b) I should have the choice to not have it injected at all
>>>>>
>>>>> So why do I think this is a problem?  The 10gen appserver auto- 
>>>>> injects an id field into the JSON documents that are stored in  
>>>>> our database, Mongo.  Can you guess what the key is?  Yep - "_id"
>>>>>
>>>>> So how can I roundtrip a doc from 10gen through couch and back?   
>>>>> I can't.
>>>>
>>>> Perhaps its worth noting that CouchDB is perfectly comfortable  
>>>> with externally generated _ids.  It only injects an _id if you  
>>>> create a new document without one.  Best,
>>>
>>> I understand that.
>>>
>>> I was just pointing out a real-world case where a JSON doc from  
>>> "somewhere else" runs into trouble...  (and yes, the issue applies  
>>> equally to the 10gen platform, when coming from "somewhere else" :)
>>>
>>> geir
>>>
>>>>
>>
>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

On Jan 1, 2009, at 9:14 PM, Flinn Mueller wrote:

> On Jan 1, 2009, at 8:24 PM, Geir Magnusson Jr. wrote:
>
>> I know I can do that.  And if CouchDB is the only "JSON source"  
>> that my apps are talking to, then that's fine - all apps can be  
>> written to expect that "schema".
>>
>> But I'm taking a different POV - where a "schema" exists outside of  
>> my app (a pseudo-standard defined by someone else) and I want to  
>> use CouchDB as a source of documents that conform to that schema.   
>> My apps should be able to consume documents in that "JSON schema"  
>> that are sourced from CouchDB, a httpd server returning static  
>> documents, some servlet app running in Tomcat, some .NET thingy, etc.
>>
>> Once you force me to store documents in a new format in order to  
>> protect data in my document that clashes w/ the server's metadata  
>> by sticking the document of interest in a top-level field :
>
>
> Isn't this any issue with any data store?  It's established that _id  
> is arbitrary just like it could be in just about any other data  
> store.  If this is a problem in couchdb it's a problem for you in  
> any data store isn't it?

If the data store chooses to inject its metadata into my document,  
yes it will be, and that's my point.

>
>
>
>> {
>>   _rev : ...
>>   _id : ...
>>   mydata : { ... the real document ... }
>> }
>>
>> then I think that CDB loses something in terms of being a general  
>> JSON document store.
>
>
>
> You're looking at couchdb's document as if it's your JSON document.   
> It's couchdb's document and it happens to be JSON.

Exactly. That is what I note below - that CDB isn't a "general JSON  
store", it's a store that "renders" its data in JSON.  (I hope it's  
clear that I think the world needs a "general JSON store" :)

>  There is nothing at all wrong with the above "schema" and it's  
> arguably the best way to store a document that you don't want to  
> conflict.  The couchdb document is always going to need metadata.   
> If it's not in _id then it's _farfagnugen and someone will  
> inevitably have the same issue.

Yep, exactly!  That's why I suggest formally separating the metadata  
out of the user data, and enhance the REST and view APIs so that you  
can get

a) just the user data (e.g. my AJAX app doesn't care or worry about  
the metadata)
b) both meta and user in the format like above
c) only meta for things where you don't care about the user data
d) maybe even a legacy mode where you inject meta into the userdata as  
it is today
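
A minimal sketch of those four modes, assuming a hypothetical internal record that keeps metadata and user data apart. The function name render, the mode names and the record layout are all illustrative assumptions, not anything CouchDB provides:

```python
# Hypothetical sketch only: render(), the mode names and the record
# layout are assumptions made for illustration, not CouchDB API.

def render(record, mode):
    """Project a stored record into one of the four shapes (a)-(d)."""
    meta = dict(record["meta"])
    user = dict(record["user"])
    if mode == "user":
        # (a) just the user data, untouched
        return user
    if mode == "both":
        # (b) metadata at the top level, user document in one field
        wrapped = dict(meta)
        wrapped["mydata"] = user
        return wrapped
    if mode == "meta":
        # (c) only the metadata
        return meta
    if mode == "legacy":
        # (d) metadata injected into the user data; on a key clash
        # the server's value overwrites the user's value
        merged = dict(user)
        merged.update(meta)
        return merged
    raise ValueError("unknown mode: %r" % (mode,))


record = {
    "meta": {"_id": "doc1", "_rev": "1"},
    "user": {"_id": "from-10gen", "title": "hello"},
}
user_only = render(record, "user")
legacy = render(record, "legacy")
```

In this sketch the user's own "_id" survives in mode (a) and is overwritten in mode (d), which is the clash being discussed.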

geir

>
>
>
>
>> Now, I realize that no one ever said that CDB is a general JSON  
>> document store, rather it's a datastore that happens to return data  
>> in JSON.  The different is subtle, but very important.   It will be  
>> interesting to see how this space ("document databases") plays out,  
>> and if my concerns are valid.  Time will tell, I guess.
>>
>> BTW, for maximum utility,  I think that the view API will have to  
>> change as well.  There's incredible power in the CDB view model,  
>> but you'll want to be able to return a pure "user document" from a  
>> call to a view (conform to some specific "schema"), rather than at  
>> least what I understand is the current metadata-oriented structure.
>>
>> geir
>>
>>
>>
>>
>> On Jan 1, 2009, at 7:53 PM, Damien Katz wrote:
>>
>>> Why can't you just always stick the desired document into an body  
>>> field on the document? If you always do that, then you can round  
>>> trip without problem.
>>>
>>> -Damien
>>>
>>>
>>> On Jan 1, 2009, at 7:17 PM, Geir Magnusson Jr. wrote:
>>>
>>>>
>>>> On Jan 1, 2009, at 7:14 PM, Adam Kocoloski wrote:
>>>>
>>>>> On Jan 1, 2009, at 4:45 PM, Geir Magnusson Jr. wrote:
>>>>>
>>>>>> b) I should have the choice to not have it injected at all
>>>>>>
>>>>>> So why do I think this is a problem?  The 10gen appserver auto- 
>>>>>> injects an id field into the JSON documents that are stored in  
>>>>>> our database, Mongo.  Can you guess what the key is?  Yep - "_id"
>>>>>>
>>>>>> So how can I roundtrip a doc from 10gen through couch and  
>>>>>> back?  I can't.
>>>>>
>>>>> Perhaps its worth noting that CouchDB is perfectly comfortable  
>>>>> with externally generated _ids.  It only injects an _id if you  
>>>>> create a new document without one.  Best,
>>>>
>>>> I understand that.
>>>>
>>>> I was just pointing out a real-world case where a JSON doc from  
>>>> "somewhere else" runs into trouble...  (and yes, the issue  
>>>> applies equally to the 10gen platform, when coming from  
>>>> "somewhere else" :)
>>>>
>>>> geir
>>>>
>>>>>
>>>
>>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Flinn Mueller <th...@gmail.com>.

On Jan 1, 2009, at 8:24 PM, Geir Magnusson Jr. wrote:

> I know I can do that.  And if CouchDB is the only "JSON source" that  
> my apps are talking to, then that's fine - all apps can be written  
> to expect that "schema".
>
> But I'm taking a different POV - where a "schema" exists outside of  
> my app (a pseudo-standard defined by someone else) and I want to use  
> CouchDB as a source of documents that conform to that schema.  My  
> apps should be able to consume documents in that "JSON schema" that  
> are sourced from CouchDB, a httpd server returning static documents,  
> some servlet app running in Tomcat, some .NET thingy, etc.
>
> Once you force me to store documents in a new format in order to  
> protect data in my document that clashes w/ the server's metadata by  
> sticking the document of interest in a top-level field :


Isn't this any issue with any data store?  It's established that _id  
is arbitrary just like it could be in just about any other data  
store.  If this is a problem in couchdb it's a problem for you in any  
data store isn't it?


> {
>    _rev : ...
>    _id : ...
>    mydata : { ... the real document ... }
> }
>
> then I think that CDB loses something in terms of being a general  
> JSON document store.



You're looking at couchdb's document as if it's your JSON document.   
It's couchdb's document and it happens to be JSON.  There is nothing  
at all wrong with the above "schema" and it's arguably the best way to  
store a document that you don't want to conflict.  The couchdb  
document is always going to need metadata.  If it's not in _id then  
it's _farfagnugen and someone will inevitably have the same issue.



> Now, I realize that no one ever said that CDB is a general JSON  
> document store, rather it's a datastore that happens to return data  
> in JSON.  The different is subtle, but very important.   It will be  
> interesting to see how this space ("document databases") plays out,  
> and if my concerns are valid.  Time will tell, I guess.
>
> BTW, for maximum utility,  I think that the view API will have to  
> change as well.  There's incredible power in the CDB view model, but  
> you'll want to be able to return a pure "user document" from a call  
> to a view (conform to some specific "schema"), rather than at least  
> what I understand is the current metadata-oriented structure.
>
> geir
>
>
>
>
> On Jan 1, 2009, at 7:53 PM, Damien Katz wrote:
>
>> Why can't you just always stick the desired document into an body  
>> field on the document? If you always do that, then you can round  
>> trip without problem.
>>
>> -Damien
>>
>>
>> On Jan 1, 2009, at 7:17 PM, Geir Magnusson Jr. wrote:
>>
>>>
>>> On Jan 1, 2009, at 7:14 PM, Adam Kocoloski wrote:
>>>
>>>> On Jan 1, 2009, at 4:45 PM, Geir Magnusson Jr. wrote:
>>>>
>>>>> b) I should have the choice to not have it injected at all
>>>>>
>>>>> So why do I think this is a problem?  The 10gen appserver auto- 
>>>>> injects an id field into the JSON documents that are stored in  
>>>>> our database, Mongo.  Can you guess what the key is?  Yep - "_id"
>>>>>
>>>>> So how can I roundtrip a doc from 10gen through couch and back?   
>>>>> I can't.
>>>>
>>>> Perhaps its worth noting that CouchDB is perfectly comfortable  
>>>> with externally generated _ids.  It only injects an _id if you  
>>>> create a new document without one.  Best,
>>>
>>> I understand that.
>>>
>>> I was just pointing out a real-world case where a JSON doc from  
>>> "somewhere else" runs into trouble...  (and yes, the issue applies  
>>> equally to the 10gen platform, when coming from "somewhere else" :)
>>>
>>> geir
>>>
>>>>
>>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

I know I can do that.  And if CouchDB is the only "JSON source" that  
my apps are talking to, then that's fine - all apps can be written to  
expect that "schema".

But I'm taking a different POV - where a "schema" exists outside of my  
app (a pseudo-standard defined by someone else) and I want to use  
CouchDB as a source of documents that conform to that schema.  My apps  
should be able to consume documents in that "JSON schema" that are  
sourced from CouchDB, a httpd server returning static documents, some  
servlet app running in Tomcat, some .NET thingy, etc.

Once you force me to store documents in a new format in order to  
protect data in my document that clashes w/ the server's metadata by  
sticking the document of interest in a top-level field :

  {
     _rev : ...
     _id : ...
     mydata : { ... the real document ... }
  }

then I think that CDB loses something in terms of being a general JSON  
document store.

Now, I realize that no one ever said that CDB is a general JSON  
document store, rather it's a datastore that happens to return data in  
JSON.  The difference is subtle, but very important.   It will be  
interesting to see how this space ("document databases") plays out,  
and if my concerns are valid.  Time will tell, I guess.

BTW, for maximum utility,  I think that the view API will have to  
change as well.  There's incredible power in the CDB view model, but  
you'll want to be able to return a pure "user document" from a call to  
a view (conform to some specific "schema"), rather than at least what  
I understand is the current metadata-oriented structure.

geir

On Jan 1, 2009, at 7:53 PM, Damien Katz wrote:

> Why can't you just always stick the desired document into an body  
> field on the document? If you always do that, then you can round  
> trip without problem.
>
> -Damien
>
>
> On Jan 1, 2009, at 7:17 PM, Geir Magnusson Jr. wrote:
>
>>
>> On Jan 1, 2009, at 7:14 PM, Adam Kocoloski wrote:
>>
>>> On Jan 1, 2009, at 4:45 PM, Geir Magnusson Jr. wrote:
>>>
>>>> b) I should have the choice to not have it injected at all
>>>>
>>>> So why do I think this is a problem?  The 10gen appserver auto- 
>>>> injects an id field into the JSON documents that are stored in  
>>>> our database, Mongo.  Can you guess what the key is?  Yep - "_id"
>>>>
>>>> So how can I roundtrip a doc from 10gen through couch and back?   
>>>> I can't.
>>>
>>> Perhaps its worth noting that CouchDB is perfectly comfortable  
>>> with externally generated _ids.  It only injects an _id if you  
>>> create a new document without one.  Best,
>>
>> I understand that.
>>
>> I was just pointing out a real-world case where a JSON doc from  
>> "somewhere else" runs into trouble...  (and yes, the issue applies  
>> equally to the 10gen platform, when coming from "somewhere else" :)
>>
>> geir
>>
>>>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 02/01/2009, at 11:23 AM, Damien Katz wrote:

> Why can't you just always stick the desired document into an body  
> field on the document? If you always do that, then you can round  
> trip without problem.

Sure, you can always do this:

{
  _id: ...
  _rev: ...
  data: {
    ... the user's document ...
  }
}

as a matter of policy in your own application.
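
A minimal sketch of that policy on the application side. The helper names wrap and unwrap and the field name "data" are assumptions chosen for illustration; nothing here talks to a server:

```python
# Hypothetical application-side helpers; the names wrap/unwrap and the
# "data" field are illustrative assumptions, not part of CouchDB.

def wrap(user_doc, doc_id=None, rev=None):
    """Put the user's document one level down, under 'data'."""
    envelope = {"data": user_doc}
    if doc_id is not None:
        envelope["_id"] = doc_id
    if rev is not None:
        envelope["_rev"] = rev
    return envelope


def unwrap(envelope):
    """Recover the user's document exactly as it was handed in."""
    return envelope["data"]


# A document from "somewhere else" that carries its own "_id".
foreign = {"_id": "id-from-elsewhere", "title": "hello"}
stored = wrap(foreign, doc_id="couch-id", rev="1")
round_tripped = unwrap(stored)
```

The foreign "_id" never meets the top-level "_id" because it sits inside the "data" member, so the document comes back unchanged.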

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Some defeats are instalments to victory.
   -- Jacob Riis

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Damien Katz <da...@apache.org>.

Why can't you just always stick the desired document into a body  
field on the document? If you always do that, then you can round trip  
without problem.

-Damien


On Jan 1, 2009, at 7:17 PM, Geir Magnusson Jr. wrote:

>
> On Jan 1, 2009, at 7:14 PM, Adam Kocoloski wrote:
>
>> On Jan 1, 2009, at 4:45 PM, Geir Magnusson Jr. wrote:
>>
>>> b) I should have the choice to not have it injected at all
>>>
>>> So why do I think this is a problem?  The 10gen appserver auto- 
>>> injects an id field into the JSON documents that are stored in our  
>>> database, Mongo.  Can you guess what the key is?  Yep - "_id"
>>>
>>> So how can I roundtrip a doc from 10gen through couch and back?  I  
>>> can't.
>>
>> Perhaps its worth noting that CouchDB is perfectly comfortable with  
>> externally generated _ids.  It only injects an _id if you create a  
>> new document without one.  Best,
>
> I understand that.
>
> I was just pointing out a real-world case where a JSON doc from  
> "somewhere else" runs into trouble...  (and yes, the issue applies  
> equally to the 10gen platform, when coming from "somewhere else" :)
>
> geir
>
>>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

On Jan 1, 2009, at 7:14 PM, Adam Kocoloski wrote:

> On Jan 1, 2009, at 4:45 PM, Geir Magnusson Jr. wrote:
>
>> b) I should have the choice to not have it injected at all
>>
>> So why do I think this is a problem?  The 10gen appserver auto- 
>> injects an id field into the JSON documents that are stored in our  
>> database, Mongo.  Can you guess what the key is?  Yep - "_id"
>>
>> So how can I roundtrip a doc from 10gen through couch and back?  I  
>> can't.
>
> Perhaps its worth noting that CouchDB is perfectly comfortable with  
> externally generated _ids.  It only injects an _id if you create a  
> new document without one.  Best,

I understand that.

I was just pointing out a real-world case where a JSON doc from  
"somewhere else" runs into trouble...  (and yes, the issue applies  
equally to the 10gen platform, when coming from "somewhere else" :)

geir

>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Adam Kocoloski <ad...@gmail.com>.

On Jan 1, 2009, at 4:45 PM, Geir Magnusson Jr. wrote:

> b) I should have the choice to not have it injected at all
>
> So why do I think this is a problem?  The 10gen appserver auto- 
> injects an id field into the JSON documents that are stored in our  
> database, Mongo.  Can you guess what the key is?  Yep - "_id"
>
> So how can I roundtrip a doc from 10gen through couch and back?  I  
> can't.

Perhaps it's worth noting that CouchDB is perfectly comfortable with  
externally generated _ids.  It only injects an _id if you create a new  
document without one.  Best,

Adam

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 02/01/2009, at 9:32 PM, Jan Lehnardt wrote:

>
> On 2 Jan 2009, at 01:16, Geir Magnusson Jr. wrote:
>
>> ok - that's the first time in the 9 years I've been associated with  
>> the ASF that I've heard that term.  Sounds kinda like they're goal  
>> is to keep things out, rather than get people and ideas in and  
>> involved :)
>
> This is not an ASF-term (afaik), but Anthony's. :)

Correct. I use it because I thought it reflected the decision making  
structure, and I agree with that structure i.e. a community that makes  
arguments to convince a committee, said committee being the actual  
decision makers. In this case, the people with commit rights are  
effectively the decision makers, because Couch will be what they commit.

Maybe the term causes distress. I'll cease and desist.

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Only two things are infinite, the universe and human stupidity, and  
I'm not sure about the former.
  -- Albert Einstein

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Jan Lehnardt <ja...@apache.org>.

On 2 Jan 2009, at 01:16, Geir Magnusson Jr. wrote:

> ok - that's the first time in the 9 years I've been associated with  
> the ASF that I've heard that term.  Sounds kinda like they're goal  
> is to keep things out, rather than get people and ideas in and  
> involved :)

This is not an ASF-term (afaik), but Antony's. :)

Cheers
Jan
--

>
>
> geir
>
> On Jan 1, 2009, at 6:51 PM, Antony Blakey wrote:
>
>>
>> On 02/01/2009, at 9:42 AM, Geir Magnusson Jr. wrote:
>>
>>> What is this "gatekeeper" thing I keep hearing about?  Do you mean  
>>> committer?
>>
>> Yes.
>>
>> Antony Blakey
>> --------------------------
>> CTO, Linkuistics Pty Ltd
>> Ph: 0438 840 787
>>
>> Man will never be free until the last king is strangled with the  
>> entrails of the last priest.
>> -- Denis Diderot
>>
>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

ok - that's the first time in the 9 years I've been associated with  
the ASF that I've heard that term.  Sounds kinda like their goal is  
to keep things out, rather than get people and ideas in and involved :)

geir

On Jan 1, 2009, at 6:51 PM, Antony Blakey wrote:

>
> On 02/01/2009, at 9:42 AM, Geir Magnusson Jr. wrote:
>
>> What is this "gatekeeper" thing I keep hearing about?  Do you mean  
>> committer?
>
> Yes.
>
> Antony Blakey
> --------------------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> Man will never be free until the last king is strangled with the  
> entrails of the last priest.
>  -- Denis Diderot
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 02/01/2009, at 9:42 AM, Geir Magnusson Jr. wrote:

> What is this "gatekeeper" thing I keep hearing about?  Do you mean  
> committer?

Yes.

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Man will never be free until the last king is strangled with the  
entrails of the last priest.
   -- Denis Diderot

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

What is this "gatekeeper" thing I keep hearing about?  Do you mean  
committer?

On Jan 1, 2009, at 5:45 PM, Antony Blakey wrote:

>
> On 02/01/2009, at 8:15 AM, Geir Magnusson Jr. wrote:
>
>> If you could point me to an explanation of why changing this is  
>> bad, I'd love to catch up on the discussion.  I assume it's a  
>> technical reason?
>
> No. The primary reason is "why change - the current mechanism has  
> worked for a year". Damien (project lead) doesn't regard change as  
> necessary, and a significant change to support top-level reflexivity  
> (which is your primary thrust) doesn't have support from the other  
> gatekeepers. There is some support for name identity, although I  
> suspect not enough to prompt a change.
>
> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> There are two ways of constructing a software design: One way is to  
> make it so simple that there are obviously no deficiencies, and the  
> other way is to make it so complicated that there are no obvious  
> deficiencies.
>  -- C. A. R. Hoare
>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

On Jan 1, 2009, at 5:45 PM, Antony Blakey wrote:

>
> On 02/01/2009, at 8:15 AM, Geir Magnusson Jr. wrote:
>
>> If you could point me to an explanation of why changing this is  
>> bad, I'd love to catch up on the discussion.  I assume it's a  
>> technical reason?
>
> No. The primary reason is "why change - the current mechanism has  
> worked for a year". Damien (project lead) doesn't regard change as  
> necessary, and a significant change to support top-level reflexivity  
> (which is your primary thrust) doesn't have support from the other  
> gatekeepers. There is some support for name identity, although I  
> suspect not enough to prompt a change.
>

Alright.  I hereby cry "Uncle" and will let this go (for a while,  
anyway...)

Thanks all for such an interesting conversation.  I look forward to  
more in this community, as I really find document and other  
"alternative" databases interesting, and think they are going to be  
key for "cloud" computing...

/me wanders off to try and get more than 6 docs/sec into Couch...

geir

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 02/01/2009, at 8:15 AM, Geir Magnusson Jr. wrote:

> If you could point me to an explanation of why changing this is bad,  
> I'd love to catch up on the discussion.  I assume it's a technical  
> reason?

No. The primary reason is "why change - the current mechanism has  
worked for a year". Damien (project lead) doesn't regard change as  
necessary, and a significant change to support top-level reflexivity  
(which is your primary thrust) doesn't have support from the other  
gatekeepers. There is some support for name identity, although I  
suspect not enough to prompt a change.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

There are two ways of constructing a software design: One way is to  
make it so simple that there are obviously no deficiencies, and the  
other way is to make it so complicated that there are no obvious  
deficiencies.
   -- C. A. R. Hoare

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

On Dec 31, 2008, at 7:40 PM, Antony Blakey wrote:

>
> On 31/12/2008, at 11:29 PM, Geir Magnusson Jr. wrote:
>
>> What trouble?  I think this is *exactly* what should be done - have  
>> CouchDB store documents that are :
>>
>> {
>>   metadata : { _rev : X, _id : Y, _woogie: Z, .... anything that  
>> needs to be added in the future, like other metadata like last  
>> update date... },
>>   userdata : {  .... the document you want to store .... }
>> }
>>
>> and then offer APIs that let you :
>>
>> a) get to this document, for libraries and clients that know they  
>> are talking to Couch and want to manipulate at this level
>>
>> b) return and accept the userdocument directly, for clients that  
>> just want to consume or produce  JSON data, w/o caring about the  
>> internal housekeeping
>
> One of the issues complicating the logic of this discussion is that  
> the document id is both metadata and, conceptually, a document member.

Well, I don't understand why it has to be. Certainly it's a  
convenience, and I wonder how much of current thinking has been  
influenced by the fact that this is what people are used to.

I can understand why CDB needs a unique document identifier, and it  
certainly would be nice to have the option of having it shoved into  
the user doc on creation.  But

a) I think that I should have the choice as to what that identifier  
is  (e.g.  Configure the database to inject the couch metadata _id as  
"_couchID" or whatever...)

b) I should have the choice to not have it injected at all

So why do I think this is a problem?  The 10gen appserver auto-injects  
an id field into the JSON documents that are stored in our database,  
Mongo.  Can you guess what the key is?  Yep - "_id"

So how can I roundtrip a doc from 10gen through couch and back?  I  
can't.

I've made the same argument at 10gen - that I should be able to set  
the identifier (and that it shouldn't be in the doc in the first place).

Then, I'd just have a doc with

{
    _couchID : ....
    _mongoID : ....
     ... data...
}

(if I chose to shove the ID into the doc)
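
A minimal sketch of points a) and b), assuming a hypothetical per-database setting for the injected key. The function inject_id and its key parameter are illustrative assumptions; neither CouchDB nor Mongo is known here to offer such an option:

```python
# Hypothetical sketch: inject_id() and its "key" setting are assumptions
# for illustration, not an existing CouchDB or Mongo feature.

def inject_id(user_doc, doc_id, key=None):
    """Return a copy of user_doc with doc_id stored under key.

    key=None models choice (b): do not inject the identifier at all.
    """
    out = dict(user_doc)
    if key is None:
        return out
    if key in out:
        # Refuse to silently overwrite a field the user already owns.
        raise KeyError("document already has a %r field" % (key,))
    out[key] = doc_id
    return out


doc = {"title": "hello"}
doc = inject_id(doc, "c-1", key="_couchID")   # choice (a), store one
doc = inject_id(doc, "m-1", key="_mongoID")   # choice (a), store two
untouched = inject_id({"title": "hello"}, "c-1", key=None)  # choice (b)
```

With distinct keys per store, both identifiers can live in the same document and neither store has to overwrite the other's field.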

> That's why, although the purest model is to have the userdata as a  
> member within a Couch document as you suggest, this doesn't look  
> that appealing:
>
> {
>  metadata: {
>    id: ...
>    rev: ...
>    ...
>  }
>  data: {
>    ... the user's document ...
>  }
> }

I can see how this isn't appealing from the perspective of current  
APIs, but a rethinking of this issue (_id and _rev) also warrants a  
re-thinking of the APIs to deal with this.

E.g. an API that lets me get a) the whole doc above  b) metadata only  
c) userdata only

>
>
> Furthermore, from a scalability perspective, always having the  
> metadata when you have the document, isn't a problem - the metadata  
> is constrained.

And from what I understand, it already exists in that manner, right?   
I mean, for efficiency, I'd guess that the _id, _rev and in the  
future, other metadata (like insert date, last modification date...)  
would be kept outside of the doc, so that they can be read and updated  
w/o having to serialize/deserialize the whole user document.

> The reverse situation of always having the data when you have the  
> metadata, is not constrained because the data is arbitrarily large.  
> IMO this means that a solution such as this:
>
> {
>  id: ...
>  rev: ...
>  ...
>  data: {
>    ... the user's document ...
>  }
> }
>
> isn't such a good idea compared to this:
>
> {
>  _metadata: {
>    id: ...
>    rev: ...
>  }
>  ... the user's document ...
> }

That only solves the problem in that there's only one reserved magical  
key (_metadata), but I don't think that really changes anything.  You  
still need to make sure any document you want to store in couch  
doesn't have a top-level _metadata element.

And while I don't know how couch works internally, we *are* really  
only talking about how the data is returned on an API call via the  
REST API or what I assume is an internal API for the M/R View stuff.

If you had an API that let you choose all, metaonly or useronly, you  
could not be burdened with stuff you didn't want or need.

>
> Unfortunately the reserved token makes the structure non-reflexive  
> without transformation, and although that's not currently an issue,  
> I can imagine it complicating certain use-cases. It makes the system  
> more complicated to reason about.
>
> I'm struggling to objectively evaluate this model and your reflexive  
> model - given Damien's attitude to this issue, my motivation to do  
> so is somewhat depressed :/

If you could point me to an explanation of why changing this is bad,  
I'd love to catch up on the discussion.  I assume it's a technical  
reason?

geir

>
>
> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> Did you hear about the Buddhist who refused Novocain during a root  
> canal?
> His goal: transcend dental medication.
>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 31/12/2008, at 11:29 PM, Geir Magnusson Jr. wrote:

> What trouble?  I think this is *exactly* what should be done - have  
> CouchDB store documents that are :
>
> {
>    metadata : { _rev : X, _id : Y, _woogie: Z, .... anything that  
> needs to be added in the future, like other metadata like last  
> update date... },
>    userdata : {  .... the document you want to store .... }
> }
>
> and then offer APIs that let you :
>
> a) get to this document, for libraries and clients that know they  
> are talking to Couch and want to manipulate at this level
>
> b) return and accept the userdocument directly, for clients that  
> just want to consume or produce  JSON data, w/o caring about the  
> internal housekeeping

One of the issues complicating the logic of this discussion is that  
the document id is both metadata and, conceptually, a document member.  
That's why, although the purest model is to have the userdata as a  
member within a Couch document as you suggest, this doesn't look that  
appealing:

{
   metadata: {
     id: ...
     rev: ...
     ...
   }
   data: {
     ... the user's document ...
   }
}

Furthermore, from a scalability perspective, always having the  
metadata when you have the document isn't a problem - the metadata is  
constrained. The reverse situation, always having the data when you  
have the metadata, is not constrained because the data is arbitrarily  
large. IMO this means that a solution such as this:

{
   id: ...
   rev: ...
   ...
   data: {
     ... the user's document ...
   }
}

isn't such a good idea compared to this:

{
   _metadata: {
     id: ...
     rev: ...
   }
   ... the user's document ...
}

Unfortunately the reserved token makes the structure non-reflexive  
without transformation, and although that's not currently an issue, I  
can imagine it complicating certain use-cases. It makes the system  
more complicated to reason about.
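
A minimal sketch of that last point, as I read it: with a reserved `_metadata` member, a user document that itself contains `_metadata` cannot be stored unchanged, so it needs a transformation first. The helper names `embed` and `extract` are hypothetical and are not part of any CouchDB API.

```python
RESERVED = "_metadata"  # the reserved token from the model above


def embed(metadata, user_doc):
    """Build the stored form: user members at the top level plus _metadata."""
    if RESERVED in user_doc:
        # This is the non-reflexive case: a document that already carries
        # the reserved key cannot be stored as-is.
        raise ValueError("user document may not use the reserved key " + RESERVED)
    stored = dict(user_doc)
    stored[RESERVED] = dict(metadata)
    return stored


def extract(stored):
    """Split a stored document back into (metadata, user_doc)."""
    user_doc = {k: v for k, v in stored.items() if k != RESERVED}
    return dict(stored[RESERVED]), user_doc
```

Storing the output of `embed` as a user document raises `ValueError`, which is the case that would need an escaping or renaming step.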

I'm struggling to objectively evaluate this model and your reflexive  
model - given Damien's attitude to this issue, my motivation to do so  
is somewhat depressed :/

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Did you hear about the Buddhist who refused Novocain during a root  
canal?
His goal: transcend dental medication.

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

On Dec 31, 2008, at 7:37 AM, Robert Dionne wrote:

>
> On Dec 30, 2008, at 8:33 PM, Antony Blakey wrote:

>>
>> 1. The current scheme of prepending _ to atom names when the atom  
>> is used inside a document. Con is the breakage of name identity,  
>> which has technical consequences as well as cognitive ones. Does  
>> the rule only apply at the top level of a document? What about  
>> future injected metadata that has internal structure?
>>
>> 2. Use '_' for all atoms, inside and outside documents. Con is the  
>> noise of extra underscores everywhere.
>>
>> 3. Don't use underscores inside documents - for id and rev at  
>> least, this wouldn't seem to be a big issue, but isn't future-proof  
>> if you want to handle other injected fields.
>>
>> 4. Use '_' for atoms that have to be injected, and make the name BE  
>> the '_' form. Con is that you have to decide in advance if an atom  
>> is going to ever be injected.
>>
>> 5. Use a '_meta' wrapper for the metadata. I don't see any  
>> technical cons, and IMO is by far the cleanest model. Name identity  
>> is preserved, it's arbitrarily extensible without scalability  
>> concerns, and is structural rather than lexical.
>
> It is clearly cleaner and has its advantages, however I have to
> agree with an earlier poster; "Putting them in a _meta group might  
> encourage aggregation and manipulation of the bookkeeping metadata  
> separately from the document, which to me sounds like a recipe for  
> trouble."

What trouble?  I think this is *exactly* what should be done - have  
CouchDB store documents that are :

  {
     metadata : { _rev : X, _id : Y, _woogie: Z, .... anything that  
needs to be added in the future, like other metadata like last update  
date... },
     userdata : {  .... the document you want to store .... }
  }

and then offer APIs that let you :

a) get to this document, for libraries and clients that know they are  
talking to Couch and want to manipulate at this level

b) return and accept the userdocument directly, for clients that just  
want to consume or produce  JSON data, w/o caring about the internal  
housekeeping
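
A rough sketch of the two access levels, (a) and (b), using a toy in-memory store. Everything here is an assumption for illustration: the class `EnvelopeStore`, its method names, and the integer revision counter are hypothetical and do not reflect CouchDB's real storage, revision format, or HTTP API.

```python
class EnvelopeStore:
    """Toy in-memory store that keeps metadata and userdata apart."""

    def __init__(self):
        self._docs = {}

    def put_userdata(self, doc_id, userdata):
        # (b) accept the plain user document; housekeeping is filled in here.
        previous = self._docs.get(doc_id)
        rev = previous["metadata"]["_rev"] + 1 if previous else 1
        self._docs[doc_id] = {
            "metadata": {"_id": doc_id, "_rev": rev},
            "userdata": userdata,
        }
        return rev

    def get_envelope(self, doc_id):
        # (a) Couch-aware clients see metadata and userdata together.
        return self._docs[doc_id]

    def get_userdata(self, doc_id):
        # (b) plain JSON consumers get back exactly what they stored.
        return self._docs[doc_id]["userdata"]
```

With this split, a user document is free to contain its own `_id` member, because the store's bookkeeping lives in a different object.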

>
>  This would be a more complex design than the current use of the  
> underscore at the top level of documents and would definitely  
> encourage a quite different implementation. I don't know the  
> internals enough yet to comment on this. The code there to date is  
> remarkably terse for what it does but this may just reflect the use  
> of Erlang.

I just have trouble seeing this POV - it seems to me that having a
reserved "namespace" (the _.*) at a specific level (the top level) of
the user document to put the metadata makes things more complex.  Not
only is it an exception to what a person can store in couch, but it
itself contains an exception - it only applies to top level.
Consumers and producers have to be aware that the documents are coming
from Couch (consumers have to know that _id and _rev are metadata
and should be ignored for application purposes, but only if in the top
level...) and producers have to avoid using _id and _rev for
application data....

With the metadata/userdata split instead, any "couch aware" code I
write can safely know that anything that's couch specific is in
doc.metadata and anything that's the stored user data is in
doc.userdata, and never the beams shall be crossed.  Any apps I write
(say AJAX stuff) don't need to special-case the handling of the
responses, since anything in a user doc is user data, and I should be
able to make requests that just return that userdata.

geir

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Robert Dionne <bo...@gmail.com>.

On Dec 30, 2008, at 8:33 PM, Antony Blakey wrote:

>
> On 30/12/2008, at 1:40 AM, Robert Dionne wrote:
>
>> With respect to a meta structure, I was going to make this comment  
>> yesterday as I think Geir was arguing for this:
>>
>> It seems to me that occam's razor argues for the simplicity of a  
>> single JSON doc, rather that a "metadoc" envelope that contains  
>> another JSON doc embedded in it. It's not clear to me that  
>> creating this separation of concerns buys anything at all. The use  
>> of an underscore to designate distinguished fields at the top  
>> level is a fairly easy convention to get your arms around.
>
> That's not actually the issue. The issue is about having a single  
> name, and not inventing a namespace technique for json docs. The  
> choices are:

I understand the issue. I noted the use of _id versus id myself and  
wasn't that put off by it, just seemed a quirk of the implementation.  
I realize you've likely written a lot of code at this point and have
run into reuse issues. It's not unusual to have different names for  
the same thing if the context is different.


>
> 1. The current scheme of prepending _ to atom names when the atom  
> is used inside a document. Con is the breakage of name identity,  
> which has technical consequences as well as cognitive ones. Does  
> the rule only apply at the top level of a document? What about  
> future injected metadata that has internal structure?
>
> 2. Use '_' for all atoms, inside and outside documents. Con is the  
> noise of extra underscores everywhere.
>
> 3. Don't use underscores inside documents - for id and rev at  
> least, this wouldn't seem to be a big issue, but isn't future-proof  
> if you want to handle other injected fields.
>
> 4. Use '_' for atoms that have to be injected, and make the name BE  
> the '_' form. Con is that you have to decide in advance if an atom  
> is going to ever be injected.
>
> 5. Use a '_meta' wrapper for the metadata. I don't see any  
> technical cons, and IMO is by far the cleanest model. Name identity  
> is preserved, it's arbitrarily extensible without scalability  
> concerns, and is structural rather than lexical.

  It is clearly cleaner and has its advantages, however I have to
agree with an earlier poster; "Putting them in a _meta group might  
encourage aggregation and manipulation of the bookkeeping metadata  
separately from the document, which to me sounds like a recipe for  
trouble."

   This would be a more complex design than the current use of the  
underscore at the top level of documents and would definitely  
encourage a quite different implementation. I don't know the  
internals enough yet to comment on this. The code there to date is  
remarkably terse for what it does but this may just reflect the use  
of Erlang.

Cheers,

Bob



>
> IMO option 5 is the best and cleanest solution.
>
> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> One should respect public opinion insofar as is necessary to avoid  
> starvation and keep out of prison, but anything that goes beyond  
> this is voluntary submission to an unnecessary tyranny.
>   -- Bertrand Russell
>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 30/12/2008, at 1:40 AM, Robert Dionne wrote:

> With respect to a meta structure, I was going to make this comment  
> yesterday as I think Geir was arguing for this:
>
> It seems to me that occam's razor argues for the simplicity of a  
> single JSON doc, rather that a "metadoc" envelope that contains  
> another JSON doc embedded in it. It's not clear to me that creating  
> this separation of concerns buys anything at all. The use of an  
> underscore to designate distinguished fields at the top level is a  
> fairly easy convention to get your arms around.

That's not actually the issue. The issue is about having a single  
name, and not inventing a namespace technique for json docs. The  
choices are:

1. The current scheme of prepending _ to atom names when the atom is  
used inside a document. Con is the breakage of name identity, which  
has technical consequences as well as cognitive ones. Does the rule  
only apply at the top level of a document? What about future injected  
metadata that has internal structure?

2. Use '_' for all atoms, inside and outside documents. Con is the  
noise of extra underscores everywhere.

3. Don't use underscores inside documents - for id and rev at least,  
this wouldn't seem to be a big issue, but isn't future-proof if you  
want to handle other injected fields.

4. Use '_' for atoms that have to be injected, and make the name BE  
the '_' form. Con is that you have to decide in advance if an atom is  
going to ever be injected.

5. Use a '_meta' wrapper for the metadata. I don't see any technical  
cons, and IMO is by far the cleanest model. Name identity is  
preserved, it's arbitrarily extensible without scalability concerns,  
and is structural rather than lexical.

IMO option 5 is the best and cleanest solution.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

One should respect public opinion insofar as is necessary to avoid  
starvation and keep out of prison, but anything that goes beyond this  
is voluntary submission to an unnecessary tyranny.
   -- Bertrand Russell

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Robert Dionne <bo...@gmail.com>.

With respect to a meta structure, I was going to make this comment  
yesterday as I think Geir was arguing for this:

It seems to me that Occam's razor argues for the simplicity of a
single JSON doc, rather than a "metadoc" envelope that contains
another JSON doc embedded in it. It's not clear to me that creating
this separation of concerns buys anything at all. The use of an
underscore to designate distinguished fields at the top level is a
fairly easy convention to get your arms around. It also provides a
nice convention for extensions, e.g. "_external". Does it blur the
distinction between data and metadata? Yes, but I think that's a good
thing.

Perhaps it's useful to turn the argument around and ask what having a  
separate metadoc buys you? You now presumably store the unique id in  
one JSON fragment and the actual doc in another, how does this  
simplify the code?

I guess I'm not a big fan of metadata

On Dec 29, 2008, at 9:21 AM, Damien Katz wrote:

> Yes, it is perfectly clear to a newbie, because it's the simplest  
> case. That's why I coded it this way initially, it seemed simpler.
>
> What isn't immediately obvious  is all the other special fields  
> that can appear in documents and in other contexts. How to make  
> that consistent? I tried, but couldn't keep it simple. The problem  
> was special names in various structures no longer have a simple  
> rule to follow, but instead you must know if this field appears in  
> a document at any time, then it starts with underscore in other  
> structures.
>
> The current rule maybe not the most intuitive to a newbie, but it  
> is far more consistent and easier to work with then when using the  
> deeper APIs. The only 2 other workable solutions I see is to either  
> stuff everything special into a _meta structure or only use HTTP  
> headers for all CouchDB meta information. But after having spent  
> much time thinking about this issue, I think the current rule is  
> the better compromise.
>
> -Damien
>
> On Dec 29, 2008, at 8:58 AM, Geir Magnusson Jr. wrote:
>
>> That rule would have been perfectly clear to me as a newbie.
>>
>> +1
>>
>> geir
>>
>> On Dec 29, 2008, at 8:55 AM, Antony Blakey wrote:
>>
>>>
>>> On 29/12/2008, at 11:11 PM, Damien Katz wrote:
>>>
>>>> The problem was there where other reserved fields in documents  
>>>> that started with underscore, but in other places the fields  
>>>> wouldn't have an underscore. Keep track of which fieldname had  
>>>> underscores and where became confusing. The rule was changed to  
>>>> be simpler to understand and deal with.
>>>
>>> A simpler rule is: _rev is the name no matter where it appears,  
>>> same with _id. I'd go so far as say that this kind of rule is so  
>>> fundamental to our idea of identity and naming, that it doesn't  
>>> even count as a rule. And there had better be a really good  
>>> reason to introduce a rule contrary to such an strongly implicit  
>>> and intrinsic concept.
>>>
>>> And as far as 'Keeping track of which fieldname had underscores",  
>>> it would seem that the current situation is the worst, because  
>>> you have to keep track not based on identity e.g. _rev and _id,  
>>> but rather on context, which is a dynamic and more intellectually  
>>> demanding concept than semantic identity. Furthermore, in this  
>>> scheme, names must be mapped under structural transformation  
>>> (such as copying the _id and _rev fields from one context to  
>>> another), which complicates generic transformations.
>>>
>>> IMO the name isn't "rev" with sometimes an underscore, rather the  
>>> name IS "_rev". Same with "_id".
>>>
>>> A single name for a concept, lexically consistent, is less  
>>> cognitive load both initially and on an ongoing basis.
>>>
>>> Antony Blakey
>>> -------------
>>> CTO, Linkuistics Pty Ltd
>>> Ph: 0438 840 787
>>>
>>> The ultimate measure of a man is not where he stands in moments  
>>> of comfort and convenience, but where he stands at times of  
>>> challenge and controversy.
>>> -- Martin Luther King
>>>
>>>
>>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Chris Anderson <jc...@gmail.com>.

On Mon, Dec 29, 2008 at 6:21 AM, Damien Katz <da...@apache.org> wrote:
>
> What isn't immediately obvious  is all the other special fields that can
> appear in documents and in other contexts. How to make that consistent? I
> tried, but couldn't keep it simple. The problem was special names in various
> structures no longer have a simple rule to follow, but instead you must know
> if this field appears in a document at any time, then it starts with
> underscore in other structures.
>
> The current rule maybe not the most intuitive to a newbie, but it is far
> more consistent and easier to work with then when using the deeper APIs. The
> only 2 other workable solutions I see is to either stuff everything special
> into a _meta structure or only use HTTP headers for all CouchDB meta
> information. But after having spent much time thinking about this issue, I
> think the current rule is the better compromise.
>

I've got to say I agree with Damien and Christopher Lenz on this one.
Prefixing "id" and "rev" with "_" in query params and in view rows
just seems silly to me. The current rules, where document fields like
"_attachments" have special meaning in CouchDB, seem simplest in the
long run. The "_" is used to qualify namespace, not name a field. This
convention is reused in CouchDB, for non-docid paths, such as _view
and _all_docs. Once you understand why the current system is built as
it is, the namespace conventions are perfectly clear.

I can see the appeal of a single reserved "_meta" field, which
contains CouchDB bookkeeping data, but I think it could also turn out
to be a step in the wrong direction. _id and _rev are meant to be
brittle: that is, changing them on a document, once they are present,
means you have stepped outside the CouchDB semantics. Putting them in
a _meta group might encourage aggregation and manipulation of the
bookkeeping metadata separately from the document, which to me sounds
like a recipe for trouble. A "_meta" field would also be an
inappropriate place to put "_attachments".

I'm not particularly sympathetic to arguments that it's harmful to
reserve top-level fields that begin with "_". Just put your data in a
doc.actually_the_data field, if you need field names that start with
underscore.
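
The convention is small enough to sketch in a few lines. This assumes only the rule stated in this thread (a field in the root of a document whose name starts with an underscore is reserved); the function name `reserved_fields` is made up for illustration and is not CouchDB code.

```python
def reserved_fields(doc):
    """Return the top-level field names that fall in the reserved "_" namespace."""
    # Only the root of the document is inspected; nested objects are free
    # to use any names, including ones that start with an underscore.
    return sorted(name for name in doc if name.startswith("_"))


doc = {
    "_id": "a",
    "_rev": "1",
    "title": "t",
    "actually_the_data": {"_private": 1},
}
print(reserved_fields(doc))
```

The nested `_private` field is not reported, which is the point of wrapping such data in a member like `actually_the_data`.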

-- 
Chris Anderson
http://jchris.mfdz.com

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Damien Katz <da...@apache.org>.

Yes, it is perfectly clear to a newbie, because it's the simplest  
case. That's why I coded it this way initially, it seemed simpler.

What isn't immediately obvious  is all the other special fields that  
can appear in documents and in other contexts. How to make that  
consistent? I tried, but couldn't keep it simple. The problem was  
special names in various structures no longer have a simple rule to  
follow, but instead you must know if this field appears in a document  
at any time, then it starts with underscore in other structures.

The current rule may not be the most intuitive to a newbie, but it is
far more consistent and easier to work with than when using the deeper
APIs. The only 2 other workable solutions I see are to either stuff
everything special into a _meta structure or only use HTTP headers for
all CouchDB meta information. But after having spent much time
thinking about this issue, I think the current rule is the better
compromise.

-Damien

On Dec 29, 2008, at 8:58 AM, Geir Magnusson Jr. wrote:

> That rule would have been perfectly clear to me as a newbie.
>
> +1
>
> geir
>
> On Dec 29, 2008, at 8:55 AM, Antony Blakey wrote:
>
>>
>> On 29/12/2008, at 11:11 PM, Damien Katz wrote:
>>
>>> The problem was there where other reserved fields in documents  
>>> that started with underscore, but in other places the fields  
>>> wouldn't have an underscore. Keep track of which fieldname had  
>>> underscores and where became confusing. The rule was changed to be  
>>> simpler to understand and deal with.
>>
>> A simpler rule is: _rev is the name no matter where it appears,  
>> same with _id. I'd go so far as say that this kind of rule is so  
>> fundamental to our idea of identity and naming, that it doesn't  
>> even count as a rule. And there had better be a really good reason  
>> to introduce a rule contrary to such an strongly implicit and  
>> intrinsic concept.
>>
>> And as far as 'Keeping track of which fieldname had underscores",  
>> it would seem that the current situation is the worst, because you  
>> have to keep track not based on identity e.g. _rev and _id, but  
>> rather on context, which is a dynamic and more intellectually  
>> demanding concept than semantic identity. Furthermore, in this  
>> scheme, names must be mapped under structural transformation (such  
>> as copying the _id and _rev fields from one context to another),  
>> which complicates generic transformations.
>>
>> IMO the name isn't "rev" with sometimes an underscore, rather the  
>> name IS "_rev". Same with "_id".
>>
>> A single name for a concept, lexically consistent, is less  
>> cognitive load both initially and on an ongoing basis.
>>
>> Antony Blakey
>> -------------
>> CTO, Linkuistics Pty Ltd
>> Ph: 0438 840 787
>>
>> The ultimate measure of a man is not where he stands in moments of  
>> comfort and convenience, but where he stands at times of challenge  
>> and controversy.
>> -- Martin Luther King
>>
>>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

That rule would have been perfectly clear to me as a newbie.

+1

geir

On Dec 29, 2008, at 8:55 AM, Antony Blakey wrote:

>
> On 29/12/2008, at 11:11 PM, Damien Katz wrote:
>
>> The problem was there where other reserved fields in documents that  
>> started with underscore, but in other places the fields wouldn't  
>> have an underscore. Keep track of which fieldname had underscores  
>> and where became confusing. The rule was changed to be simpler to  
>> understand and deal with.
>
> A simpler rule is: _rev is the name no matter where it appears, same  
> with _id. I'd go so far as say that this kind of rule is so  
> fundamental to our idea of identity and naming, that it doesn't even  
> count as a rule. And there had better be a really good reason to  
> introduce a rule contrary to such an strongly implicit and intrinsic  
> concept.
>
> And as far as 'Keeping track of which fieldname had underscores", it  
> would seem that the current situation is the worst, because you have  
> to keep track not based on identity e.g. _rev and _id, but rather on  
> context, which is a dynamic and more intellectually demanding  
> concept than semantic identity. Furthermore, in this scheme, names  
> must be mapped under structural transformation (such as copying the  
> _id and _rev fields from one context to another), which complicates  
> generic transformations.
>
> IMO the name isn't "rev" with sometimes an underscore, rather the  
> name IS "_rev". Same with "_id".
>
> A single name for a concept, lexically consistent, is less cognitive  
> load both initially and on an ongoing basis.
>
> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> The ultimate measure of a man is not where he stands in moments of  
> comfort and convenience, but where he stands at times of challenge  
> and controversy.
>  -- Martin Luther King
>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 29/12/2008, at 11:11 PM, Damien Katz wrote:

> The problem was there where other reserved fields in documents that  
> started with underscore, but in other places the fields wouldn't  
> have an underscore. Keep track of which fieldname had underscores  
> and where became confusing. The rule was changed to be simpler to  
> understand and deal with.

A simpler rule is: _rev is the name no matter where it appears, same
with _id. I'd go so far as to say that this kind of rule is so
fundamental to our idea of identity and naming, that it doesn't even
count as a rule. And there had better be a really good reason to
introduce a rule contrary to such a strongly implicit and intrinsic
concept.

And as far as "Keeping track of which fieldname had underscores", it
would seem that the current situation is the worst, because you have  
to keep track not based on identity e.g. _rev and _id, but rather on  
context, which is a dynamic and more intellectually demanding concept  
than semantic identity. Furthermore, in this scheme, names must be  
mapped under structural transformation (such as copying the _id and  
_rev fields from one context to another), which complicates generic  
transformations.
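
To illustrate the mapping cost, here is a small sketch. The flat row shape with `id` and `rev` members is a simplification for illustration, and `ROW_TO_DOC` and `row_to_doc_stub` are hypothetical names, not part of CouchDB.

```python
# Under the current scheme, the same two concepts carry different names
# depending on context, so a copy needs an explicit name mapping.
ROW_TO_DOC = {"id": "_id", "rev": "_rev"}


def row_to_doc_stub(row):
    """Copy the identity fields from a view-style row into a document stub."""
    return {doc_name: row[row_name] for row_name, doc_name in ROW_TO_DOC.items()}


row = {"id": "a", "rev": "1", "key": "k"}
stub = row_to_doc_stub(row)  # holds "_id" and "_rev", renamed from "id" and "rev"
```

If the names were `_id` and `_rev` in both contexts, the mapping table would disappear and a generic "copy these fields" helper would be enough.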

IMO the name isn't "rev" with sometimes an underscore, rather the name  
IS "_rev". Same with "_id".

A single name for a concept, lexically consistent, is less cognitive  
load both initially and on an ongoing basis.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

The ultimate measure of a man is not where he stands in moments of  
comfort and convenience, but where he stands at times of challenge and  
controversy.
   -- Martin Luther King

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

If the metadata was separate from the "user data" everywhere, as it is  
in view results,  the problem just simply goes away. But I realize  
that won't be the solution.  :)

My $0.02 as a new user - I think I agree with Jan here - consistency  
is important.

I started really learning about CouchDB yesterday, so my memory of the  
"journey" is clear, and the "special fields" thing did throw a WTF  
exception... not because of the existence of _id and _rev in user data  
(that threw the "OhGeeze..." exception), but because they were  
inconsistently handled and presented in basic things like _all_docs/

The start of my thread yesterday ("newbie question #1") was really  
about that - why "id" *and* "_id",  "rev" *and* "_rev", how the  
presence of "key" in _all_docs/ seems to be a leak from how views  
work, why are view results so different than document results (e.g.  
why you don't inject _rev, _id and key into the data part of the
view, why view/all is different format than db/_all_docs/...)

I do understand that as codebases evolve, APIs are going to change,  
and when it's OSS, you always accrete users depending on "whatever is  
there", so change is hard.

But this could be solved by versioning the API -  put something in a  
standard place in the URI to indicate that it's not the current API e.g.

  GET v1/somedatabase/_all_docs

and this gives you the chance to offer other APIs (like one for us
crackpots that insist that implementation metadata doesn't belong in  
the user data :)
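
A sketch of what such a prefix could look like from a client's side. The `v1` prefix is only the proposal made here, not an existing CouchDB feature, and `versioned_path` is a hypothetical helper.

```python
def versioned_path(database, resource, version=None):
    """Build a request path, optionally prefixed with an API version."""
    parts = [database, resource]
    if version is not None:
        # Hypothetical prefix, e.g. "v1"; no prefix means the current API.
        parts.insert(0, "v%d" % version)
    return "/" + "/".join(parts)


print(versioned_path("somedatabase", "_all_docs", version=1))
print(versioned_path("somedatabase", "_all_docs"))
```

Existing clients keep using the unprefixed path, while a client that wants a different presentation opts in by version.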

Dunno.  Haven't used CouchDB in anger yet, so take all this w/ a grain  
of salt, but all my observations and what I hope are taken as  
constructive criticisms notwithstanding, I really am excited about  
CouchDB.

geir

On Dec 29, 2008, at 8:14 AM, Jan Lehnardt wrote:

>
> On 29 Dec 2008, at 13:41, Damien Katz wrote:
>
>> I disagree we should change this back. I don't know if anyone  
>> remembers, but this is how I implemented long ago in the first  
>> versions of the post-XML CouchDB. The problem was there where other  
>> reserved fields in documents that started with underscore, but in  
>> other places the fields wouldn't have an underscore. Keep track of  
>> which fieldname had underscores and where became confusing. The  
>> rule was changed to be simpler to understand and deal with. If it's  
>> in the root of a doc and it starts with underscore, it's reserved.  
>> You don't see the reserved underscore fields anywhere else, only in  
>> document top level.
>
> My issue with this is when learning about CouchDB, documents come  
> first and the "special fields are prefixed with an underscore" rule  
> is taken up naturally. Later with views and query parameters, this  
> rule is broken. You could argue that when we teach documents, we  
> should make the actual rule more explicit and that is certainly  
> true, but we don't control the ways people pick up CouchDB. We  
> control the API however and are able to reduce WTF-factors.
>
>
>> -Damien
>>
>>
>> On Dec 28, 2008, at 2:21 PM, Jan Lehnardt wrote:
>>
>>>
>>> On 28 Dec 2008, at 14:32, Antony Blakey wrote:
>>>
>>>>
>>>> On 28/12/2008, at 11:56 PM, Paul Davis wrote:
>>>>
>>>>> Why "id" and "rev" are used instead of "_id" and
>>>>> "_rev" I couldn't really tell you. I hate to say "historical  
>>>>> reasons"
>>>>> but I'm guessing that when Damien designed the view output he just
>>>>> labeled then "id" and "rev" without the underscore because it's  
>>>>> not
>>>>> needed to distinguish from the rest of the doc.
>>>>
>>>> Desirable to change that (and any other inconsistencies) before a  
>>>> 1.0
>>>
>>> This keeps coming up and I've been advocating this for a while now:
>>>
>>> +1 for changing view result rows `rev` to `_rev` to avoid confusion.
>>>
>>> CC'ing dev@c.a.o.
>>>
>>> Cheers
>>> Jan
>>> --
>>>
>>
>>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

If the metadata was separate from the "user data" everywhere, as it is  
in view results,  the problem just simply goes away. But I realize  
that won't be the solution.  :)

My $0.02 as a new user - I think I agree with Jan here - consistency  
is important.

I started really learning about CouchDB yesterday, so my memory of the  
"journey" is clear, and the "special fields" thing did throw a WTF  
exception... not because of the existence of _id and _rev in user data  
(that threw the "OhGeeze..." exception), but because they were  
inconsistently handled and presented in basic things like _all_docs/

The start of my thread yesterday ("newbie question #1") was really  
about that - why "id" *and* "_id",  "rev" *and* "_rev", how the  
presence of "key" in _all_docs/ seems to be a leak from how views  
work, why are view results so different than document results (e.g.  
why you don't inject _rev, _id_ and key into the data part of the  
view, why view/all is different format than db/_all_docs/...)

I do understand that as codebases evolve, APIs are going to change,  
and when it's OSS, you always accrete users depending on "whatever is  
there", so change is hard.

But this could be solved by versioning the API -  put something in a  
standard place in the URI to indicate that it's not the current API e.g.

  GET v1/somedatabase/_all_docs

and this gives you the chance offer other APIs (like one for us  
crackpots that insist that implementation metadata doesn't belong in  
the user data :)

Dunno.  Haven't used CouchDB in anger yet, so take all this w/ a grain  
of salt, but all my observations and what I hope are taken as  
constructive criticisms notwithstanding, I really am excited about  
CouchDB.

geir

On Dec 29, 2008, at 8:14 AM, Jan Lehnardt wrote:

>
> On 29 Dec 2008, at 13:41, Damien Katz wrote:
>
>> I disagree we should change this back. I don't know if anyone  
>> remembers, but this is how I implemented long ago in the first  
>> versions of the post-XML CouchDB. The problem was there where other  
>> reserved fields in documents that started with underscore, but in  
>> other places the fields wouldn't have an underscore. Keep track of  
>> which fieldname had underscores and where became confusing. The  
>> rule was changed to be simpler to understand and deal with. If it's  
>> in the root of a doc and it starts with underscore, it's reserved.  
>> You don't see the reserved underscore fields anywhere else, only in  
>> document top level.
>
> My issue with this is when learning about CouchDB, documents come  
> first and the "special fields are prefixed with an underscore" rule  
> is taken up naturally. Later with views and query parameters, this  
> rule is broken. You could argue that when we teach documents, we  
> should make the actual rule more explicit and that is certainly  
> true, but we don't control the ways people pick up CouchDB. We  
> control the API however and are able to reduce WTF-factors.
>
>
>> -Damien
>>
>>
>> On Dec 28, 2008, at 2:21 PM, Jan Lehnardt wrote:
>>
>>>
>>> On 28 Dec 2008, at 14:32, Antony Blakey wrote:
>>>
>>>>
>>>> On 28/12/2008, at 11:56 PM, Paul Davis wrote:
>>>>
>>>>> Why "id" and "rev" are used instead of "_id" and
>>>>> "_rev" I couldn't really tell you. I hate to say "historical  
>>>>> reasons"
>>>>> but I'm guessing that when Damien designed the view output he just
> >>>>> labeled them "id" and "rev" without the underscore because it's
>>>>> not
>>>>> needed to distinguish from the rest of the doc.
>>>>
>>>> Desirable to change that (and any other inconsistencies) before a  
>>>> 1.0
>>>
>>> This keeps coming up and I've been advocating this for a while now:
>>>
>>> +1 for changing view result rows `rev` to `_rev` to avoid confusion.
>>>
>>> CC'ing dev@c.a.o.
>>>
>>> Cheers
>>> Jan
>>> --
>>>
>>
>>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Jan Lehnardt <ja...@apache.org>.

On 29 Dec 2008, at 13:41, Damien Katz wrote:

> I disagree we should change this back. I don't know if anyone  
> remembers, but this is how I implemented it long ago in the first
> versions of the post-XML CouchDB. The problem was there were other
> reserved fields in documents that started with underscore, but in
> other places the fields wouldn't have an underscore. Keeping track of
> which fieldname had underscores and where became confusing. The rule
> was changed to be simpler to understand and deal with. If it's in  
> the root of a doc and it starts with underscore, it's reserved. You  
> don't see the reserved underscore fields anywhere else, only in  
> document top level.

My issue with this is when learning about CouchDB, documents come  
first and the "special fields are prefixed with an underscore" rule is  
taken up naturally. Later with views and query parameters, this rule  
is broken. You could argue that when we teach documents, we should  
make the actual rule more explicit and that is certainly true, but we  
don't control the ways people pick up CouchDB. We control the API  
however and are able to reduce WTF-factors.


> -Damien
>
>
> On Dec 28, 2008, at 2:21 PM, Jan Lehnardt wrote:
>
>>
>> On 28 Dec 2008, at 14:32, Antony Blakey wrote:
>>
>>>
>>> On 28/12/2008, at 11:56 PM, Paul Davis wrote:
>>>
>>>> Why "id" and "rev" are used instead of "_id" and
>>>> "_rev" I couldn't really tell you. I hate to say "historical  
>>>> reasons"
>>>> but I'm guessing that when Damien designed the view output he just
>>>> labeled them "id" and "rev" without the underscore because it's not
>>>> needed to distinguish from the rest of the doc.
>>>
>>> Desirable to change that (and any other inconsistencies) before a  
>>> 1.0
>>
>> This keeps coming up and I've been advocating this for a while now:
>>
>> +1 for changing view result rows `rev` to `_rev` to avoid confusion.
>>
>> CC'ing dev@c.a.o.
>>
>> Cheers
>> Jan
>> --
>>
>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 29/12/2008, at 11:11 PM, Damien Katz wrote:

> The problem was there were other reserved fields in documents that
> started with underscore, but in other places the fields wouldn't
> have an underscore. Keeping track of which fieldname had underscores
> and where became confusing. The rule was changed to be simpler to  
> understand and deal with.

A simpler rule is: _rev is the name no matter where it appears, same  
with _id. I'd go so far as say that this kind of rule is so  
fundamental to our idea of identity and naming, that it doesn't even  
count as a rule. And there had better be a really good reason to  
introduce a rule contrary to such a strongly implicit and intrinsic
concept.

And as far as "Keeping track of which fieldname had underscores", it
would seem that the current situation is the worst, because you have  
to keep track not based on identity e.g. _rev and _id, but rather on  
context, which is a dynamic and more intellectually demanding concept  
than semantic identity. Furthermore, in this scheme, names must be  
mapped under structural transformation (such as copying the _id and  
_rev fields from one context to another), which complicates generic  
transformations.

IMO the name isn't "rev" with sometimes an underscore, rather the name  
IS "_rev". Same with "_id".

A single name for a concept, lexically consistent, is less cognitive  
load both initially and on an ongoing basis.
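
To make the mapping cost concrete, here is a minimal Ruby sketch. The row shape (id, rev, key) is a simplified assumption rather than exact CouchDB output, and the helper names row_to_doc_stub and copy_metadata are hypothetical, not part of any client library.

```ruby
# Context-dependent names: copying metadata from a view row into a
# document needs an explicit rename table (assumed row shape).
def row_to_doc_stub(row, renames = { 'id' => '_id', 'rev' => '_rev' })
  renames.each_with_object({}) do |(row_name, doc_name), doc|
    doc[doc_name] = row[row_name] if row.key?(row_name)
  end
end

# One name per concept: the same copy is a plain selection of keys,
# and works unchanged whatever structure the source came from.
def copy_metadata(source, names = %w[_id _rev])
  source.select { |name, _| names.include?(name) }
end

row  = { 'id' => 'foo', 'rev' => '1234', 'key' => 'foo' }
stub = row_to_doc_stub(row)
same = copy_metadata({ '_id' => 'foo', '_rev' => '1234', 'title' => 'Foo' })
```

The rename table is the part that every generic transformation has to carry around under the current scheme.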

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

The ultimate measure of a man is not where he stands in moments of  
comfort and convenience, but where he stands at times of challenge and  
controversy.
   -- Martin Luther King

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Damien Katz <da...@apache.org>.

I disagree we should change this back. I don't know if anyone  
remembers, but this is how I implemented it long ago in the first
versions of the post-XML CouchDB. The problem was there were other
reserved fields in documents that started with underscore, but in
other places the fields wouldn't have an underscore. Keeping track of
which fieldname had underscores and where became confusing. The rule
was changed to be simpler to understand and deal with. If it's in the  
root of a doc and it starts with underscore, it's reserved. You don't  
see the reserved underscore fields anywhere else, only in document top  
level.

-Damien

On Dec 28, 2008, at 2:21 PM, Jan Lehnardt wrote:

>
> On 28 Dec 2008, at 14:32, Antony Blakey wrote:
>
>>
>> On 28/12/2008, at 11:56 PM, Paul Davis wrote:
>>
>>> Why "id" and "rev" are used instead of "_id" and
>>> "_rev" I couldn't really tell you. I hate to say "historical  
>>> reasons"
>>> but I'm guessing that when Damien designed the view output he just
>>> labeled them "id" and "rev" without the underscore because it's not
>>> needed to distinguish from the rest of the doc.
>>
>> Desirable to change that (and any other inconsistencies) before a 1.0
>
> This keeps coming up and I've been advocating this for a while now:
>
> +1 for changing view result rows `rev` to `_rev` to avoid confusion.
>
> CC'ing dev@c.a.o.
>
> Cheers
> Jan
> --
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Paul Davis <pa...@gmail.com>.

On Sun, Dec 28, 2008 at 2:21 PM, Jan Lehnardt <ja...@apache.org> wrote:
>
> On 28 Dec 2008, at 14:32, Antony Blakey wrote:
>
>>
>> On 28/12/2008, at 11:56 PM, Paul Davis wrote:
>>
>>> Why "id" and "rev" are used instead of "_id" and
>>> "_rev" I couldn't really tell you. I hate to say "historical reasons"
>>> but I'm guessing that when Damien designed the view output he just
>>> labeled them "id" and "rev" without the underscore because it's not
>>> needed to distinguish from the rest of the doc.
>>
>> Desirable to change that (and any other inconsistencies) before a 1.0
>
> This keeps coming up and I've been advocating this for a while now:
>
> +1 for changing view result rows `rev` to `_rev` to avoid confusion.
>

+1 too. Consistency FTW

> CC'ing dev@c.a.o.
>
> Cheers
> Jan
> --
>
>

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Jan Lehnardt <ja...@apache.org>.

On 30 Dec 2008, at 12:22, Sven Helmberger wrote:

> Jan Lehnardt schrieb:
>> On 30 Dec 2008, at 12:04, Sven Helmberger wrote:
>>> Antony Blakey schrieb:
>>>> It's easy to get this right and make everything simpler to use  
>>>> (by leveraging fundamental cognitive expectations such as name
>>>> identity) and extensible. So far I haven't seen any good  
>>>> technical argument why either name identity (_id/_rev  
>>>> everywhere), or _meta, shouldn't be adopted.
>>>
>>> We also haven't seen a good technical argument why the current  
>>> state should be changed.
>> There is no technical argument. The argument is
>> beginner-friendliness.
>
> And we can be a lot more beginner-friendly by just improving the  
> documentation quality, something which is needed anyway.

Hi Sven,

please check dev@c.a.o for the entire discussion. I don't want to
reiterate everything again. Just briefly: We don't control how people
learn CouchDB, but we control the API. You are certainly right that  
our documentation should be decent, but a nicely designed API is the  
first step to get there (note that I am not arguing either point of  
the original discussion). Just having good documentation on a horrible  
API is not going to cut it (See PHP's strstr(), preg_match(), ereg()  
argument order. It's a perfect example of good docs for a  
bad^Worganically grown API).

Cheers
Jan
--

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Sven Helmberger <sv...@gmx.de>.

Jan Lehnardt schrieb:
> 
> On 30 Dec 2008, at 12:04, Sven Helmberger wrote:
> 
>> Antony Blakey schrieb:
>>> It's easy to get this right and make everything simpler to use (by 
>>> leveraging fundamental cognitive expectations such as name identity)
>>> and extensible. So far I haven't seen any good technical argument why 
>>> either name identity (_id/_rev everywhere), or _meta, shouldn't be 
>>> adopted.
>>
>> We also haven't seen a good technical argument why the current state 
>> should be changed.
> 
> There is no technical argument. The argument is beginner-friendliness.
> 

And we can be a lot more beginner-friendly by just improving the 
documentation quality, something which is needed anyway.

Regards,
Sven Helmberger

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Jan Lehnardt <ja...@apache.org>.

On 30 Dec 2008, at 12:04, Sven Helmberger wrote:

> Antony Blakey schrieb:
>> It's easy to get this right and make everything simpler to use (by  
>> leveraging fundamental cognitive expectations such as name
>> identity) and extensible. So far I haven't seen any good technical  
>> argument why either name identity (_id/_rev everywhere), or _meta,  
>> shouldn't be adopted.
>
> We also haven't seen a good technical argument why the current state  
> should be changed.

There is no technical argument. The argument is beginner-friendliness.

Cheers
Jan
--

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Sven Helmberger <sv...@gmx.de>.

Antony Blakey schrieb:
> 
> It's easy to get this right and make everything simpler to use (by 
> leveraging fundamental cognitive expectations such as name identity)
> and extensible. So far I haven't seen any good technical argument why 
> either name identity (_id/_rev everywhere), or _meta, shouldn't be adopted.
> 

We also haven't seen a good technical argument why the current state 
should be changed.

Regards,
Sven Helmberger

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 30/12/2008, at 9:51 AM, Noah Slater wrote:

> I can see it from both angles, of course we should be discussing  
> things on
> technical merit and not some fluffy concept of who's better at  
> predicting
> things. On the other hand, Damien has a significant head-start on us  
> all with
> CouchDB and Lotus Notes before it, so if I was going to trust  
> anyone's intuition
> it would be his.

Surely design is better than intuition?

There are general principles at work here, about enabling  
extensibility and presuming that we cannot predict the future. A  
design that enables extensibility, and allows for unanticipated uses  
and emergent properties, is superior to a design that is deliberately  
made brittle for expedient reasons, backed-up by an assertion that a  
guiding hand knows how this will evolve and be used for all time.

It's easy to get this right and make everything simpler to use (by  
leveraging fundamental cognitive expectations such as name identity)
and extensible. So far I haven't seen any good technical argument why  
either name identity (_id/_rev everywhere), or _meta, shouldn't be  
adopted.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

The difference between ordinary and extraordinary is that little extra.

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Jan Lehnardt <ja...@apache.org>.

On 30 Dec 2008, at 00:21, Noah Slater wrote:

> On Tue, Dec 30, 2008 at 09:47:23AM +1030, Antony Blakey wrote:
>>
>> On 30/12/2008, at 4:11 AM, Dean Landolt wrote:
>>
>>> Of course, given the lore around here that Damien has this thing
>>> designed in his head all the way through 2.0, I think it's a safe  
>>> bet
>>> metadata would remain fairly stable for some time to come.
>>
>> I don't think that's very good reasoning.
>>
>> What about version 3, 4, ... etc? Why not design it to be stable and
>> extensible *by design*, rather than attributing god-like predictive
>> powers to someone.
>
> I can see it from both angles, of course we should be discussing  
> things on
> technical merit and not some fluffy concept of who's better at  
> predicting
> things. On the other hand, Damien has a significant head-start on us  
> all with
> CouchDB and Lotus Notes before it, so if I was going to trust  
> anyone's intuition
> it would be his.

+1.

—

Damien's reasoning is not inconsistent, it just includes one newbie-WTF.
While I'm still for changing it, if it doesn't happen, we just need to  
make sure
TO DOCUMENT THE HELL OUT OF IT.

I got shot down with this proposal three times now, so I give up :)

Cheers
Jan
--

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Noah Slater <ns...@apache.org>.

On Tue, Dec 30, 2008 at 11:08:03AM +1030, Antony Blakey wrote:
> IMO you are asserting intuition over reason, which doesn't feel like
> engineering to me.

I wrote:

> I can see it from both angles, of course we should be discussing things on
> technical merit and not some fluffy concept of who's better at predicting
> things. On the other hand, Damien has a significant head-start on us all with
> CouchDB and Lotus Notes before it, so if I was going to trust anyone's
> intuition it would be his.

I think I made it pretty clear that I consider intuition to be a reasonable
fall-back decider in lieu of convincing technical argument. Anyway...

> In any case, I didn't mean it personally, sorry.

Not a problem.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 30/12/2008, at 10:57 AM, Noah Slater wrote:

> On Tue, Dec 30, 2008 at 10:19:58AM +1030, Antony Blakey wrote:
>>
>> On 30/12/2008, at 9:51 AM, Noah Slater wrote:
>>
>>> Damien has a significant head-start on us all with
>>> CouchDB and Lotus Notes before it, so if I was going to trust  
>>> anyone's
>>> intuition
>>> it would be his.
>>
>> IMO the principles here have nothing to do with either Couch or  
>> Notes.
>> They have more to do with general API design principles. I have 31  
>> years
>> experience with issues like this, in a very wide variety of  
>> languages and
>> contexts, but I'm not asking anyone to accept anything I assert on  
>> that
>> basis.
>>
>> Are we professional engineers or mystics?
>
> Come on, I clearly outlined my position. Please don't  
> mischaracterise me.

IMO you are asserting intuition over reason, which doesn't feel like  
engineering to me.

In any case, I didn't mean it personally, sorry.

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Lack of will power has caused more failure than lack of intelligence  
or ability.
  -- Flower A. Newhouse

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Noah Slater <ns...@apache.org>.

On Tue, Dec 30, 2008 at 10:19:58AM +1030, Antony Blakey wrote:
>
> On 30/12/2008, at 9:51 AM, Noah Slater wrote:
>
>> Damien has a significant head-start on us all with
>> CouchDB and Lotus Notes before it, so if I was going to trust anyone's
>> intuition
>> it would be his.
>
> IMO the principles here have nothing to do with either Couch or Notes.
> They have more to do with general API design principles. I have 31 years
> experience with issues like this, in a very wide variety of languages and
> contexts, but I'm not asking anyone to accept anything I assert on that
> basis.
>
> Are we professional engineers or mystics?

Come on, I clearly outlined my position. Please don't mischaracterise me.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 30/12/2008, at 9:51 AM, Noah Slater wrote:

> Damien has a significant head-start on us all with
> CouchDB and Lotus Notes before it, so if I was going to trust  
> anyone's intuition
> it would be his.

IMO the principles here have nothing to do with either Couch or Notes.  
They have more to do with general API design principles. I have 31  
years experience with issues like this, in a very wide variety of  
languages and contexts, but I'm not asking anyone to accept anything I  
assert on that basis.

Are we professional engineers or mystics?

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Always have a vision. Why spend your life making other people’s dreams?
  -- Orson Welles (1915-1985)

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Noah Slater <ns...@apache.org>.

On Tue, Dec 30, 2008 at 09:47:23AM +1030, Antony Blakey wrote:
>
> On 30/12/2008, at 4:11 AM, Dean Landolt wrote:
>
>> Of course, given the lore around here that Damien has this thing
>> designed in his head all the way through 2.0, I think it's a safe bet
>> metadata would remain fairly stable for some time to come.
>
> I don't think that's very good reasoning.
>
> What about version 3, 4, ... etc? Why not design it to be stable and
> extensible *by design*, rather than attributing god-like predictive
> powers to someone.

I can see it from both angles, of course we should be discussing things on
technical merit and not some fluffy concept of who's better at predicting
things. On the other hand, Damien has a significant head-start on us all with
CouchDB and Lotus Notes before it, so if I was going to trust anyone's intuition
it would be his.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 30/12/2008, at 4:11 AM, Dean Landolt wrote:

> Of course, given the lore around here that Damien has this thing
> designed in his head all the way through 2.0, I think it's a safe bet
> metadata would remain fairly stable for some time to come.

I don't think that's very good reasoning.

What about version 3, 4, ... etc? Why not design it to be stable and  
extensible *by design*, rather than attributing god-like predictive  
powers to someone.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

If you pick up a starving dog and make him prosperous, he will not  
bite you. This is the principal difference between a man and a dog.
   -- Mark Twain

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Dean Landolt <de...@deanlandolt.com>.

On Mon, Dec 29, 2008 at 11:53 AM, ara.t.howard <ar...@gmail.com>wrote:

>
> On Dec 29, 2008, at 1:50 AM, Christopher Lenz wrote:
>
>  I think this is the wrong direction. If these naming issues really are
>> generating substantial confusion (which I doubt), we should rather be
>> looking into changing the mechanism for including meta-information with
>> documents so that the leading underscores could be dropped across the board.
>> Maybe something like:
>>
>>
>>  {
>>   "_meta": {
>>     "id": "foo",
>>     "rev": "1234"
>>   },
>>   "title": "Foo",
>>   ...
>>  }
>>
>> This would also cause client breakage, but at least we wouldn't be
>> scattering more underscores around the API.
>>
>
>
> i must say i mostly agree with geir and support the above style
> wholeheartedly.  it would simplify writing client libraries *massively*.
>  currently client code will end up looking like
>
>  if doc['_rev']
>    ...
>
>  if doc['_id']
>    ...
>
>
> for instance
>
>
> cfp:~/src/git/couchrest > grep _id lib/couchrest/core/model.rb |grep _rev
>      unless self['_id'] && self['_rev']
>    # Removes the <tt>_id</tt> and <tt>_rev</tt> fields, preparing the
>
>
> and this will only get more confusing as time goes on and the number of
> metadata fields increases.  it's a PITA for couch libs  to have to track
> these changes over time and seriously hampers forward/backwards compat,
> practically ensuring migration issues for each and every client lib and
> document stored.
>
> the beauty of '_meta' (as above - and although i'd propose simply '_') is
> that it minimally but completely represents the concept couch docs need to
> express which is simply that "there will be meta data in the doc."  occam's
> razor alone should be proof that this is the best method of piggybacking
> that inside a couch/json doc.
>
> one thing people haven't seemed concerned with is that of meta data being
> flat: it seems quite likely that, over time, metadata in docs would evolve
> to have trees of information and in that case the current rules will lead to
> slightly ugly structures with required leading underscores at the root but
> optional ones after.
>
> another thing which is entirely imaginable is some sort of server side
> merge operation which couch's current metadata scheme cannot handle.  for
> instance if you ever had to merge
>
>  { '_id' : 1, 'value' : 40 }
>  { '_id' : 2, 'value' : 42 }
>
> you'd have to do something like
>
>  { '_id1' : 1, '_id2' : 2 }
>
> but a '_' scheme could simply return one metadata element per doc in an
> array
>
>  { '_' : [ metadata1, metadata2 ] }
>
> or any other nested structure without needing to pollute the top namespace
>
>
>
> there are actually *many* precedents for moving towards a single _meta/_
> key in repos out there, rails' handling of options in methods springs to
> mind.  they started out doing things like
>
>  some_method options = {}
>
> where certain keys would be skimmed out from options and the rest passed
> through to be, for instance, html attributes or some such but increasingly
> they are moving towards signatures like
>
>  some_method :key => :value, :html => {:id => 'foo', :class => 'bar'}
>
> because mixing the semantics of keys in a single hash leads to spaghetti
> code, both in core and client, as the process of accretion takes hold.
>
>
> anyhow, my 2cts is that the current rules for couch are fairly simple, but
> they can in fact be, and forever remain, both an order of magnitude
> simpler and completely consistent.

I tend to agree that the rules are damn simple now but with time and the
possible buildup of meta cruft, they could get pretty complicated pretty
fast. Of course, given the lore around here that Damien has this thing
designed in his head all the way through 2.0, I think it's a safe bet
metadata would remain fairly stable for some time to come.

Still, just to throw my two pence in, my company's been working quite a bit
with MarkLogic (in spite of my many objections) and their handling of
metadata is pretty flexible. As an xml db, everything's, of course, xml, but
the principles remain -- metadata, or properties in their parlance, are
entirely separate from the actual documents, but reside at the same uri.
There are special api hooks to get at the properties (which I think is
pretty weak, and wouldn't work for couch anyway), but what's sexy is that
since the properties docs are just xml, you can tack on your own properties
all you want, including nesting them, and get at them through the apis or
along a specially-exposed xpath axis. They just reserve a few properties of
their own that you can't step on, but otherwise you can hook into their
metadata facilities in a very flexible way.

While this kind of ultimate flexibility may not be necessary for couch --
given the nature of json docs they have no trouble expressing their
properties inline -- it could certainly be helpful to maintain app-level
housekeeping without polluting the documents themselves. So the notion of
just one reserved attribute, be it "_meta" or "_" or whatever, would be a
perfectly serviceable solution that could yield a few benefits down the
road, like the ability to pass in your own _meta objects (so long as your
attributes don't begin with '_', for instance, reserving these for future
couch properties). So inside _meta (or whatever) _id can become id (because
it's user-definable) and _rev stays _rev (unless that should be definable as
well) and everything else falls into place (except for the massive overhaul
that will probably be needed in the core...ouch!)...

Then just like include_docs there could be an include_meta, or some such, to
offer a more efficient way to get at important data in the view apis without
yanking in the whole doc, or it could just be included by default like it is
now, just in a _meta envelope. It could also help facilitate plugins,
offering a doc-level place for info about, but not part of, a doc (which is
pretty much the embodiment of metadata).

Just a thought...

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by "ara.t.howard" <ar...@gmail.com>.

On Dec 29, 2008, at 1:50 AM, Christopher Lenz wrote:

> I think this is the wrong direction. If these naming issues really  
> are generating substantial confusion (which I doubt), we should  
> rather be looking into changing the mechanism for including
> meta-information with documents so that the leading underscores could be
> dropped across the board. Maybe something like:
>
>
>  {
>    "_meta": {
>      "id": "foo",
>      "rev": "1234"
>    },
>    "title": "Foo",
>    ...
>  }
>
> This would also cause client breakage, but at least we wouldn't be  
> scattering more underscores around the API.

i must say i mostly agree with geir and support the above style
wholeheartedly.  it would simplify writing client libraries *massively*.
currently client code will end up looking like

   if doc['_rev']
     ...

   if doc['_id']
     ...

for instance

cfp:~/src/git/couchrest > grep _id lib/couchrest/core/model.rb | grep _rev
       unless self['_id'] && self['_rev']
     # Removes the <tt>_id</tt> and <tt>_rev</tt> fields, preparing the

and this will only get more confusing as time goes on and the number  
of metadata fields increases.  it's a PITA for couch libs  to have to  
track these changes over time and seriously hampers forward/backwards  
compat, practically ensuring migration issues for each and every  
client lib and document stored.

the beauty of '_meta' (as above - and although i'd propose simply '_')  
is that it minimally but completely represents the concept couch docs  
need to express which is simply that "there will be meta data in the  
doc."  occam's razor alone should be proof that this is the best  
method of piggybacking that inside a couch/json doc.

one thing people haven't seemed concerned with is that of meta data
being flat: it seems quite likely that, over time, metadata in docs  
would evolve to have trees of information and in that case the current  
rules will lead to slightly ugly structures with required leading  
underscores at the root but optional ones after.

another thing which is entirely imaginable is some sort of server side  
merge operation which couch's current metadata scheme cannot handle.   
for instance if you ever had to merge

   { '_id' : 1, 'value' : 40 }
   { '_id' : 2, 'value' : 42 }

you'd have to do something like

   { '_id1' : 1, '_id2' : 2 }

but a '_' scheme could simply return one metadata element per doc in  
an array

   { '_' : [ metadata1, metadata2 ] }

or any other nested structure without needing to pollute the top  
namespace

there are actually *many* precedents for moving towards a single
_meta/_ key in repos out there, rails' handling of options in methods
springs to mind.  they started out doing things like

   some_method options = {}

where certain keys would be skimmed out from options and the rest  
passed through to be, for instance, html attributes or some such but  
increasingly they are moving towards signatures like

   some_method :key => :value, :html => {:id => 'foo', :class => 'bar'}

because mixing the semantics of keys in a single hash leads to
spaghetti code, both in core and client, as the process of accretion  
takes hold.
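
a rough sketch of the difference, assuming a made-up list of behaviour keys. this is not rails' actual implementation, just an illustration of skimming keys out of a flat hash versus reading them from a dedicated :html key.

```ruby
# Hypothetical key list and helper names, for illustration only.
BEHAVIOUR_KEYS = [:confirm, :method].freeze

# Flat style: behaviour keys are skimmed out of one hash and whatever is
# left over is treated as HTML attributes.
def split_flat(options = {})
  behaviour = options.select { |key, _| BEHAVIOUR_KEYS.include?(key) }
  html      = options.reject { |key, _| BEHAVIOUR_KEYS.include?(key) }
  [behaviour, html]
end

# Nested style: HTML attributes live under their own :html key, so the two
# kinds of key can never collide.
def split_nested(options = {})
  html      = options.fetch(:html, {})
  behaviour = options.reject { |key, _| key == :html }
  [behaviour, html]
end

flat_result   = split_flat(:confirm => 'Sure?', :id => 'foo', :class => 'bar')
nested_result = split_nested(:confirm => 'Sure?',
                             :html => { :id => 'foo', :class => 'bar' })
```

in the flat style an html attribute that happens to be called :method would be swallowed as a behaviour key; the nested style has no such ambiguity.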

anyhow, my 2cts is that the current rules for couch are fairly simple,  
but they can in fact be, and forever remain, both an order of
magnitude simpler and completely consistent.

a @ http://codeforpeople.com/
--
we can deny everything, except that we have the possibility of being  
better. simply reflect on that.
h.h. the 14th dalai lama

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Jan Lehnardt <ja...@apache.org>.

On 29 Dec 2008, at 11:48, Noah Slater wrote:

> On Mon, Dec 29, 2008 at 09:11:29PM +1030, Antony Blakey wrote:
>>
>> On 29/12/2008, at 7:20 PM, Christopher Lenz wrote:
>>
>>> This would break all substantial client code out there for
>>> questionable benefit.
>>
>> It's not even a released product yet. Preserving accidental
>> implementation artifacts to satisfy early pre-release clients is a
>> very sad idea.
>
> Come again? CouchDB is most certainly a released product.
>
> I understand the concepts about being able to break backwards
> compatibility before the legendary 1.0 release of a free software
> project, but categorically denying there is anything to consider is
> wildly misleading!

We've been very clear in the past that we might break the API before 1.0
(which Damien narrowed down to 0.9 now, which is okay). I'm not
categorically denying that there's client code out there, but we document
the changes well and the changes required should be minimal and easy to do.

At this stage, client code should not be a concern here.

Cheers
Jan
--

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Noah Slater <ns...@apache.org>.

On Mon, Dec 29, 2008 at 09:47:36PM +1030, Antony Blakey wrote:
>
> On 29/12/2008, at 9:18 PM, Noah Slater wrote:
>
>> I understand the concepts about being able to break backwards
>> compatibility before the legendary 1.0 release of a free software
>> project, but categorically denying there is anything to consider is
>> wildly misleading!
>
> I think a 1.0 release is not especially significant to free software
> projects compared to non-free projects.
>
> Even with commercial software and internal releases there are issues
> related to change management and consequent costs of compatibility
> breaking, so I don't think there's nothing to consider, but Christopher
> was commenting about 'substantial client code out there', which to me
> sounds like an argument I believe is only significant for 'released'
> versions. Without some distinction between 'released' and
> 'no-guarantees-work-in-progress', you can't experiment with changes.

I'm not sure I see your point. CouchDB has a whole truck load of third party
clients and community code that would all instantly break. That's not to say we
shouldn't do it, but we should certainly consider the consequences.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 29/12/2008, at 9:18 PM, Noah Slater wrote:

> I understand the concepts about being able to break backwards
> compatibility before the legendary 1.0 release of a free software
> project, but categorically denying there is anything to consider is
> wildly misleading!

I think a 1.0 release is not especially significant to free software  
projects compared to non-free projects.

Even with commercial software and internal releases there are issues  
related to change management and consequent costs of compatibility  
breaking, so I don't think there's nothing to consider, but  
Christopher was commenting about 'substantial client code out there',  
which to me sounds like an argument I believe is only significant for  
'released' versions. Without some distinction between 'released' and  
'no-guarantees-work-in-progress', you can't experiment with changes.

Surely 1.0 means something - I assert that there are very strong  
expectations about backwards compatibility within major point  
releases, and the pre/post-1.0 transition is very significant in that  
respect.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Some defeats are instalments to victory.
   -- Jacob Riis

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Noah Slater <ns...@apache.org>.

On Mon, Dec 29, 2008 at 09:11:29PM +1030, Antony Blakey wrote:
>
> On 29/12/2008, at 7:20 PM, Christopher Lenz wrote:
>
>> This would break all substantial client code out there for
>> questionable benefit.
>
> It's not even a released product yet. Preserving accidental
> implementation artifacts to satisfy early pre-release clients is a very
> sad idea.

Come again? CouchDB is most certainly a released product.

I understand the concepts about being able to break backwards compatibility
before the legendary 1.0 release of a free software project, but categorically
denying there is anything to consider is wildly misleading!

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Antony Blakey <an...@gmail.com>.

On 29/12/2008, at 7:20 PM, Christopher Lenz wrote:

> This would break all substantial client code out there for  
> questionable benefit.

It's not even a released product yet. Preserving accidental  
implementation artifacts to satisfy early pre-release clients is a  
very sad idea.

> I don't think there's anything inconsistent here, or anything to be  
> confused about. The rule is simple: reserved fields *inside*  
> documents have a leading underscore so as not to clash with user fields
> (as user fields must not have a leading underscore at the top-level  
> of the document).

Why have any rule about 'inside' or 'outside' a document - just use  
the same name everywhere.

> There are a couple of meta-data fields that are packed into  
> documents, such as "id" and "rev". Now just because those fields get  
> the leading-underscore treatment in documents doesn't mean they need  
> a leading underscore whenever they appear in other places, such as  
> view results.

By using a consistent name for fields you reduce confusion and  
emphasise that they have the same semantic type. Your argument is  
based on the underscore not actually being part of the name, but  
rather a removable qualifier that is used to reduce the chances of  
collision. Contextually sensitive name transformations in an API are  
an unnecessary complication.
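
As a sketch of what that transformation costs a client today, consider the mapping below. The row layout is simplified and assumed for illustration, and the helper name is hypothetical rather than taken from any existing library.

```ruby
# Hypothetical mapping a client has to carry around because the same two
# fields are spelled differently in view rows and in documents.
ROW_TO_DOC = { 'id' => '_id', 'rev' => '_rev' }.freeze

# Translate the row spelling into the document spelling, ignoring any other
# row fields such as "key" or "value".
def row_to_doc_fields(row)
  ROW_TO_DOC.each_with_object({}) do |(row_key, doc_key), fields|
    fields[doc_key] = row[row_key] if row.key?(row_key)
  end
end

# Simplified, assumed row shape; the values are made up.
row    = { 'id' => 'foo', 'rev' => '1234', 'key' => 'foo' }
fields = row_to_doc_fields(row)
```

With one spelling used everywhere, the mapping table and the helper would both disappear.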

> Then the question is, where would you stop? Would you also rename  
> "key" to "_key", "value" to "_value", and so on, for consistency?  
> What about the "?rev=1234" query-string parameter? We could get to a  
> point where every term used in the API will have a leading  
> underscore just "to be safe" :P

Both key and value never appear in a couch-metadata role within a  
document. I think _rev in the query string parameter is a good idea  
for consistency. Whenever a _rev appears, its name should be _rev.
Same for the _id of a document.

> I think this is the wrong direction. If these naming issues really  
> are generating substantial confusion (which I doubt), we should  
> rather be looking into changing the mechanism for including meta- 
> information with documents so that the leading underscores could be  
> dropped across the board.

It's not about *substantial* confusion - I'd like to see as much  
consistency and regularity in the API as possible before a 1.0  
release; said release being IMO a point at which backwards-compatibility
cost-benefit concerns become appropriate.

> Maybe something like:
>
>
>  {
>    "_meta": {
>      "id": "foo",
>      "rev": "1234"
>    },
>    "title": "Foo",
>    ...
>  }
>
> This would also cause client breakage, but at least we wouldn't be  
> scattering more underscores around the API.

I really like this idea.

> Oh, and if this thread was actually only about the "rev" field in  
> the view results of _all_docs (note it's *not* in all view
> results), why not just drop it instead? Is there any practical  
> reason it's in there?

I think it should either be in every view result that has an id, or in  
none. In so many cases you need the id and the rev to represent point- 
in-time identity, I think they should be supplied together whenever  
feasible.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

A Buddhist walks up to a hot-dog stand and says, "Make me one with  
everything". He then pays the vendor and asks for change. The vendor  
says, "Change comes from within".

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Christopher Lenz <cm...@gmx.de>.

On 28.12.2008, at 20:21, Jan Lehnardt wrote:
> On 28 Dec 2008, at 14:32, Antony Blakey wrote:
>> On 28/12/2008, at 11:56 PM, Paul Davis wrote:
>>
>>> Why "id" and "rev" are used instead of "_id" and
>>> "_rev" I couldn't really tell you. I hate to say "historical
>>> reasons" but I'm guessing that when Damien designed the view output
>>> he just labeled them "id" and "rev" without the underscore because
>>> it's not needed to distinguish from the rest of the doc.
>>
>> Desirable to change that (and any other inconsistencies) before a 1.0
>
> This keeps coming up and I've been advocating this for a while now:
>
> +1 for changing view result rows `rev` to `_rev` to avoid confusion.

(The following assumes that you'd also want to change "id" to "_id"...)

-1

This would break all substantial client code out there for  
questionable benefit.

I don't think there's anything inconsistent here, or anything to be  
confused about. The rule is simple: reserved fields *inside* documents  
have a leading underscore so as not to clash with user fields (as user
fields must not have a leading underscore at the top-level of the  
document). There are a couple of meta-data fields that are packed into  
documents, such as "id" and "rev". Now just because those fields get  
the leading-underscore treatment in documents doesn't mean they need a  
leading underscore whenever they appear in other places, such as view  
results.

Then the question is, where would you stop? Would you also rename  
"key" to "_key", "value" to "_value", and so on, for consistency? What  
about the "?rev=1234" query-string parameter? We could get to a point  
where every term used in the API will have a leading underscore just  
"to be safe" :P

I think this is the wrong direction. If these naming issues really are  
generating substantial confusion (which I doubt), we should rather be  
looking into changing the mechanism for including meta-information  
with documents so that the leading underscores could be dropped across  
the board. Maybe something like:

   {
     "_meta": {
       "id": "foo",
       "rev": "1234"
     },
     "title": "Foo",
     ...
   }

This would also cause client breakage, but at least we wouldn't be  
scattering more underscores around the API.
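
To give a feel for the client-side impact, here is a minimal sketch that converts between the current flat layout and the "_meta" layout. Both helper names are hypothetical, and "_meta" itself is only a suggestion in this thread, not an existing CouchDB feature.

```ruby
# Hypothetical helpers; "_meta" is a proposal, not an existing feature.

# Move every top-level underscore field under "_meta", dropping the
# leading underscore from its name.
def to_meta_layout(doc)
  meta, user = doc.partition { |key, _| key.start_with?('_') }
  nested = meta.map { |key, value| [key.sub(/\A_/, ''), value] }.to_h
  { '_meta' => nested }.merge(user.to_h)
end

# Reverse direction: lift the "_meta" entries back to underscore-prefixed
# top-level fields.
def from_meta_layout(doc)
  meta = doc.fetch('_meta', {})
  flat = meta.map { |key, value| ["_#{key}", value] }.to_h
  flat.merge(doc.reject { |key, _| key == '_meta' })
end

flat_doc   = { '_id' => 'foo', '_rev' => '1234', 'title' => 'Foo' }
nested_doc = to_meta_layout(flat_doc)
round_trip = from_meta_layout(nested_doc)
```

The conversion is mechanical in both directions, which suggests a client library could absorb the change behind its existing accessors.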

Oh, and if this thread was actually only about the "rev" field in the  
view results of _all_docs (note it's *not* in all view results), why
not just drop it instead? Is there any practical reason it's in there?

Cheers,
Chris
--
Christopher Lenz
   cmlenz at gmx.de
   http://www.cmlenz.net/

Re: Changing rev to _rev in view results (Was: Re: newbie question #1)

Posted by Damien Katz <da...@apache.org>.

I disagree we should change this back. I don't know if anyone
remembers, but this is how I implemented it long ago in the first
versions of the post-XML CouchDB. The problem was that there were other
reserved fields in documents that started with underscore, but in
other places the fields wouldn't have an underscore. Keeping track of
which field names had underscores and where became confusing. The rule
was changed to be simpler to understand and deal with. If it's in the  
root of a doc and it starts with underscore, it's reserved. You don't  
see the reserved underscore fields anywhere else, only in document top  
level.
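
Read as code, that rule might look like the sketch below. The helper names are hypothetical and this is not how CouchDB itself implements the check; it only restates the rule as stated here.

```ruby
# Hypothetical restatement of the rule: a field is reserved only when it
# sits at the root of a document (depth 0) and starts with an underscore.
def reserved_field?(name, depth)
  depth.zero? && name.start_with?('_')
end

# Split a document's top-level fields into reserved and user fields.
def split_reserved(doc)
  reserved, user = doc.partition { |name, _| reserved_field?(name, 0) }
  [reserved.to_h, user.to_h]
end

doc = {
  '_id'    => 'foo',
  '_rev'   => '1234',
  'title'  => 'Foo',
  'nested' => { '_note' => 'user data' } # underscore below the root: not reserved
}

reserved, user = split_reserved(doc)
```

Because the check only looks at the document root, the nested "_note" field stays ordinary user data.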

-Damien

On Dec 28, 2008, at 2:21 PM, Jan Lehnardt wrote:

>
> On 28 Dec 2008, at 14:32, Antony Blakey wrote:
>
>>
>> On 28/12/2008, at 11:56 PM, Paul Davis wrote:
>>
>>> Why "id" and "rev" are used instead of "_id" and
>>> "_rev" I couldn't really tell you. I hate to say "historical
>>> reasons" but I'm guessing that when Damien designed the view output
>>> he just labeled them "id" and "rev" without the underscore because
>>> it's not needed to distinguish from the rest of the doc.
>>
>> Desirable to change that (and any other inconsistencies) before a 1.0
>
> This keeps coming up and I've been advocating this for a while now:
>
> +1 for changing view result rows `rev` to `_rev` to avoid confusion.
>
> CC'ing dev@c.a.o.
>
> Cheers
> Jan
> --
>