Posted to user@couchdb.apache.org by Daniel T <da...@gmail.com> on 2009/02/05 18:52:17 UTC

Database Disruption From The Couch

This is a blog entry I wrote about CouchDB about a month ago... been meaning
to post it here
http://danieltsadok.wordpress.com/2008/12/22/database-disruption-from-the-couch/

:-Daniel Tsadok

Re: Database Disruption From The Couch

Posted by Noah Slater <ns...@apache.org>.
On Thu, Feb 05, 2009 at 12:52:17PM -0500, Daniel T wrote:
> This is a blog entry I wrote about CouchDB about a month ago... been meaning
> to post it here
> http://danieltsadok.wordpress.com/2008/12/22/database-disruption-from-the-couch/

Have you considered adding your blog to Planet CouchDB?

-- 
Noah Slater, http://tumbolia.org/nslater

Re: Database Disruption From The Couch

Posted by Jeremy Wall <jw...@google.com>.
On Sun, Feb 8, 2009 at 3:25 PM, Brian Candler <B....@pobox.com> wrote:

> On Fri, Feb 06, 2009 at 10:10:29AM +0100, Andrea Schiavini wrote:
> > Daniel, I've read your post, and it is definitely interesting. However I
> > don't agree with you, when you say that CouchDB makes Rails obsolete.
> Rails
> > is not only ActiveRecord, it's a model-view-controller framework that
> helps
> > hugely during the development of big applications. The target of Rails is
> > not to merely interact with the db, but to provide a good framework to
> > handle a complex application that *also* interacts with a DB.
>
> I have been working through some ideas related to this over the weekend and
> playing with some code. I have a few points which I'd like to raise for
> discussion.
>
> (1) Personally, I'm not yet ready to move all view generation into the
> browser (Futon-like), i.e. where Javascript fetches the JSON, reformats as
> HTML, and submits back as JSON. In any case, supporting browsers without
> Javascript is still a useful capability.
>
> So for now at least, I still need a layer which will build HTML from a
> document, and allow document create/update via form POST.
>
> (2) I need an application logic layer, which enforces business rules.
>
> Whilst couchdb may be gaining some features which could be used for 1 and 2
> (_show and _action?) I don't think that Couchdb is necessarily the right
> place to put this logic. In particular, I don't want to write lots of
> application logic in Javascript, especially without a decent testing
> framework which is independent of any browser and of Java.


Just wanted to plug Test.TAP here. We use it on the Joose project and it's
completely environment agnostic. We test joose in .NET, Rhino and all
browsers using it. So the testing framework might just be there for you.
http://code.google.com/p/test-tap/

I built it specifically because I wanted something that didn't need a
browser to work.

Also Joose makes building application logic a little bit easier if you like
Meta-Object programming styles.


>
>
> Put another way: document *storage/replication* and application
> *processing*
> of documents can (and in many cases probably should) remain separate.
> Couchapps remain an interesting idea however.
>
> (3) I do like the idea of non-HTML clients being able to read and submit
> native JSON. This gives an external API, and also a possible migration
> route
> to moving more functionality into the browser and/or the couchdb server in
> future. Potentially, many of these requests can be proxied straight through
> to couchdb.
>
> So I've been thinking about the impedance mismatch between Rails-style
> applications and couchdb.
>
> An important issue here is what the URL scheme should look like. Couchdb
> has
> a single flat space for all documents; Rails and co. tend to break up
> resources into /resourcetype/id (where 'resourcetype' really is a
> controller
> class). For HTML interaction we need the usual bundle of
>
>  Collection actions:  list, new, create
>  Member actions:      show, edit, update, destroy
>
> So if you treat couchdb as a flat pool of documents, you might end up with
>
>   /876384763284718      having {"type":"Post"}      -- show post
>   /129837823746876      having {"type":"Comment"}   -- show comment
>   /_new?type=Post
>   /_new?type=Comment
>
> This means that the application layer has to fetch the named document,
> *then* analyse what type it has, before dispatching to the appropriate
> controller/view logic.
>
> An alternative I have been exploring is to combine the class and the ID
> into
> the doc_id, separated by space (e.g. "Post 1234", "Comment 9999"). These
> could be either user-assigned meaningful IDs, or random uuids; it doesn't
> really make much difference. So you might have:
>
>  GET /Post%201234               -- show
>  GET /Post%201234/_edit
>  PUT /Post%201234
>  DELETE /Post%201234
>  GET /Post                      -- index collection
>  GET /Post/_new
>  POST /Post/_create
>
> (Aside: this implies that attachment names cannot start with underscore, as
> I'm using them for member actions like _edit. Also, I'd be interested to
> know what URLs couchapps like sofa use for these actions)
>
> Some useful side effects of this are:
> - browsing in Futon becomes a lot more useful;
> - belongs-to links can display useful information (comment belongs-to
>  Post 1234) without necessarily having to retrieve the other document.
>
> Anyway, I have started writing some Ruby code to try out this pattern:
>
>    http://pobox.com/~b.candler/software/couchdb/miniblog.rb
>
> [WARNING: extremely rough code!! You also need to install
> public/javascripts/jquery.js if you want to be able to add/remove tags]
>
> This was an interesting learning experience. For example, I found that even
> the 'bare metal' couchrest was too thick: it didn't let me pass on raw
> JSON.
> So I just used sinatra and rest-client directly.
>
> I found I missed certain helpers, such as Javascript ones. But I think it
> shows that this sort of 'thin middleware' approach could be workable,
> perhaps still using Rails for its richer view generation capabilities.
>
> I didn't (yet) miss having true "model" classes, although I have not
> started
> to work on doing server-side validation.
>
> Some of the actions turned out to be extremely simple though: e.g.
>
>  get '/Post' do
>    # TODO: pagination
>    @posts = Db.get_docs("_all_docs", :startkey=>"Post ", :endkey=>"Post!")
>    haml :post_index
>  end
>
> (4) Error handling: for Javascript user interfaces, if they are going to
> rely on the JSON response for failed 'model' validation, then the JSON
> error
> format needs to be carefully described and sufficiently detailed that the
> front-end can display meaningful messages to the user, e.g. which
> particular
> fields are invalid and why. This would apply for Javascript front-ends
> talking to Rails-like middleware, and also talking directly to couchdb
> should the validation end up in there (as it seems is soon going to be
> possible).
>
> This is something that it would be helpful to standardise, as it may make
> it
> easier to move the validation into couchdb later. That is, I'm quite happy
> with the idea of couchdb providing the 'model' part of MVC which 'models'
> normally provide in Rails.
>
> (5) Without a schema and without migrations, apps are going to have to be
> more robust against coming across member data of the 'wrong' type (e.g.
> expecting post.tags to be an array, and finding a string). In an ORM,
> perhaps the expected type could be declared, and the ORM could then either
> ignore the value or cast it automatically if the data is of the wrong type.
>
> Map-reduce functions are also going to have to be robust to data of the
> 'wrong' type being present.
>
> I discovered this when I found my post/tags structure was setting
> "tags":"foo" instead of "tags":["foo"] when a single tag was entered in the
> web page - this was an issue with sinatra, see
>
> http://groups.google.com/group/sinatrarb/browse_thread/thread/4790a956bb3242eb
>
> (6) I want to "think" in couchdb and to write my application in terms of
> couchdb native structures, like map/reduce views which I have hand-designed
> to work how I want, rather than use an abstraction layer which forces its
> own particular way of working.
>
> The abstraction layers I have looked at so far don't allow me to work how I
> want - such as embedding the class and ID into the doc_id as described
> above, or handling has-many collections using specially collated views
> where
> the parent is immediately followed by all its children.
>
> Oh well, that's just a bit of a brain dump. Maybe looking through the code
> I
> linked to above will make more sense of it, at least for Ruby people.
>
> Regards,
>
> Brian.
>

Re: Database Disruption From The Couch

Posted by Brian Candler <B....@pobox.com>.
On Tue, Feb 10, 2009 at 04:26:21PM -0500, Dean Landolt wrote:
> > Again, I expect here that a single POST will only be able to update a
> > single
> > document. But the Rails-type way of doing nested parameters maps well to
> > JSON:
> >
> >   foo[bar]=hello       =>   {"foo":{"bar":"hello"}}
> >
> >   foo[bar][]=hello&foo[bar][]=world
> >
> >                        =>   {"foo":{"bar":["hello","world"]}}
> 
> 
> One doc update per POST seems like a pretty reasonable design constraint for
> now -- if you want more, you can always go server side. But that's a pretty
> sexy solution to form serialization I hadn't even considered. Is there some
> way to also differentiate numerics? If so, that's *perfect*.

Unfortunately not - in Rails you just get strings, and it's up to you to map
to numerics if you need them (or let ActiveRecord do that). Perhaps you
could add some special symbol to the names, like foo#

There's also no provision for empty array or explicit null - but in couchdb,
the absence of an attribute is as good as that.
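
A minimal sketch of the "foo#" idea above, assuming a purely hypothetical convention in which a trailing "#" on a field name marks the value as numeric. The helper name coerce_numerics is made up for illustration, is not part of Rails or CouchDB, and only handles a flat hash of string values:

```ruby
# Hypothetical convention: a trailing "#" on a field name marks a numeric value.
def coerce_numerics(params)
  params.each_with_object({}) do |(name, value), out|
    if name.end_with?("#")
      key = name.chomp("#")
      # Integer() and Float() raise ArgumentError on malformed input,
      # so bad numbers are rejected rather than silently stored as strings.
      out[key] = value.match?(/\A-?\d+\z/) ? Integer(value, 10) : Float(value)
    else
      out[name] = value
    end
  end
end

form = { "age#" => "42", "price#" => "1.5", "name" => "Bob" }
puts coerce_numerics(form).inspect
```

Whether the marker should be stripped from the stored field name, as done here, is a design choice the thread leaves open.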

Re: Database Disruption From The Couch

Posted by Dean Landolt <de...@deanlandolt.com>.
On Tue, Feb 10, 2009 at 4:09 PM, Brian Candler <B....@pobox.com> wrote:

> On Mon, Feb 09, 2009 at 11:30:30AM -0800, Chris Anderson wrote:
> > > (1) Personally, I'm not yet ready to move all view generation into the
> > > browser (Futon-like), i.e. where Javascript fetches the JSON, reformats
> as
> > > HTML, and submits back as JSON. In any case, supporting browsers
> without
> > > Javascript is still a useful capability.
> >
> > The _show and _list features give you the capability to serve HTML or
> > other content types directly based on either doc or view queries. They
> > are a little lacking in documentation, but the test suite should be
> > enough to get you started.
>
> Thanks. If I understand correctly, _show only acts on a single document? In
> practice this may be less of a problem with couchdb than with a SQL-backed
> system (where a single page often combines multiple models), but I can
> still
> see cases where I want to combine a document with some related summary
> information from a view. Maybe this could be done with iframes.


Or you could just use a simple ajax call to pull in any additional data. I
can understand if you're looking to build the _show templates unobtrusively
by merging more than one doc or even views on the server (and I wish there
were a way to do that), but what's more *obtrusive* than iframes? ;)


> Similarly, AFAICT _list is also basic: a header, N rows from a single view,
> a footer.
>
> > If it doesn't work already, it'd be trivial to teach Couch to
> > understand normal HTML form POSTs, with some bare-bones conversion to a
> > JSON document (eg: each field is treated as a string, in a flat
> > namespace)
>
> Again, I expect here that a single POST will only be able to update a
> single
> document. But the Rails-type way of doing nested parameters maps well to
> JSON:
>
>   foo[bar]=hello       =>   {"foo":{"bar":"hello"}}
>
>   foo[bar][]=hello&foo[bar][]=world
>
>                        =>   {"foo":{"bar":["hello","world"]}}


One doc update per POST seems like a pretty reasonable design constraint for
now -- if you want more, you can always go server side. But that's a pretty
sexy solution to form serialization I hadn't even considered. Is there some
way to also differentiate numerics? If so, that's *perfect*.

Re: Database Disruption From The Couch

Posted by Brian Candler <B....@pobox.com>.
On Mon, Feb 09, 2009 at 11:30:30AM -0800, Chris Anderson wrote:
> > (1) Personally, I'm not yet ready to move all view generation into the
> > browser (Futon-like), i.e. where Javascript fetches the JSON, reformats as
> > HTML, and submits back as JSON. In any case, supporting browsers without
> > Javascript is still a useful capability.
> 
> The _show and _list features give you the capability to serve HTML or
> other content types directly based on either doc or view queries. They
> are a little lacking in documentation, but the test suite should be
> enough to get you started.

Thanks. If I understand correctly, _show only acts on a single document? In
practice this may be less of a problem with couchdb than with a SQL-backed
system (where a single page often combines multiple models), but I can still
see cases where I want to combine a document with some related summary
information from a view. Maybe this could be done with iframes.

Similarly, AFAICT _list is also basic: a header, N rows from a single view,
a footer.

> If it doesn't work already, it'd be trivial to teach Couch to
> understand normal HTML form POSTs, with some bare-bones conversion to a
> JSON document (eg: each field is treated as a string, in a flat
> namespace)

Again, I expect here that a single POST will only be able to update a single
document. But the Rails-type way of doing nested parameters maps well to
JSON:

   foo[bar]=hello       =>   {"foo":{"bar":"hello"}}

   foo[bar][]=hello&foo[bar][]=world

                        =>   {"foo":{"bar":["hello","world"]}}
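
The mapping above can be sketched in a few lines of Ruby. This is an illustration only, not how Rails or Rack actually parse parameters: the helper name nested_params is hypothetical, and it handles just the two shapes shown (name[key] and name[key][]), with every value left as a string.

```ruby
require "json"
require "uri"

# Hypothetical helper: turn a form-encoded body into a nested Hash.
def nested_params(body)
  URI.decode_www_form(body).each_with_object({}) do |(name, value), result|
    # "foo[bar][]" becomes ["foo", "bar", "[]"]; a trailing "[]" means "append".
    keys = name.scan(/\[\]|[^\[\]]+/)
    *path, last = keys
    if last == "[]"
      *path, leaf = path
      parent = path.reduce(result) { |hash, key| hash[key] ||= {} }
      (parent[leaf] ||= []) << value
    else
      parent = path.reduce(result) { |hash, key| hash[key] ||= {} }
      parent[last] = value
    end
  end
end

puts JSON.generate(nested_params("foo[bar]=hello"))
puts JSON.generate(nested_params("foo[bar][]=hello&foo[bar][]=world"))
```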

> > (4) Error handling: for Javascript user interfaces, if they are going to
> > rely on the JSON response for failed 'model' validation, then the JSON error
> > format needs to be carefully described and sufficiently detailed that the
> > front-end can display meaningful messages to the user, e.g. which particular
> > fields are invalid and why.
> 
> Agreed - I'm working on some validation helpers for Couchapp and my
> Sofa blog. Part of what's up in the air here is where to draw the line
> between the "standard library" of helpers included with CouchDB, and
> what should be maintained in its own project.

Yes. I looked at Sofa's validation, and it basically just throws an error on
the first erroneous field it finds:

  if (type == 'post') {
    // post required fields
    require(author, "Posts must have an author.")
    require(newDoc.body, "Posts must have a body field")
    require(newDoc.html, "Posts must have an html field.");
    ... etc

    => {"forbidden":"Posts must have an author."}   etc

If there could be library support for building an errors structure which
maps individual tags in the source object to errors for that field, that
would be useful - not least because a well-defined structure could then be
interpreted by a corresponding library at the client side.
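
As a rough sketch of such an errors structure, written in Ruby for the middleware side rather than in a CouchDB validation function: the field list, the messages, and the "errors" key are all assumptions made for illustration, not an agreed format.

```ruby
require "json"

# Hypothetical rules: field name => message used when the field is missing.
REQUIRED = {
  "author" => "Posts must have an author.",
  "body"   => "Posts must have a body field.",
  "html"   => "Posts must have an html field."
}.freeze

# Collect every failure, keyed by field, instead of stopping at the first one.
def validation_errors(doc)
  REQUIRED.each_with_object({}) do |(field, message), errors|
    value = doc[field]
    (errors[field] ||= []) << message if value.nil? || value.to_s.strip.empty?
  end
end

doc = { "author" => "brian", "body" => "" }
errors = validation_errors(doc)
puts JSON.generate("errors" => errors) unless errors.empty?
```

A client-side library could then walk the "errors" object and attach each message to the matching form field.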

Cheers,

Brian.

Re: Database Disruption From The Couch

Posted by Chris Anderson <jc...@apache.org>.
On Sun, Feb 8, 2009 at 1:25 PM, Brian Candler <B....@pobox.com> wrote:
>
> (1) Personally, I'm not yet ready to move all view generation into the
> browser (Futon-like), i.e. where Javascript fetches the JSON, reformats as
> HTML, and submits back as JSON. In any case, supporting browsers without
> Javascript is still a useful capability.

The _show and _list features give you the capability to serve HTML or
other content types directly based on either doc or view queries. They
are a little lacking in documentation, but the test suite should be
enough to get you started.

>
> So for now at least, I still need a layer which will build HTML from a
> document, and allow document create/update via form POST.
>

If it doesn't work already, it'd be trivial to teach Couch to
understand normal HTML form POSTs, with some bare-bones conversion to a
JSON document (eg: each field is treated as a string, in a flat
namespace)


> (2) I need an application logic layer, which enforces business rules.
>

Validation functions can do a lot of this work. One thing that a
pure-Couch solution will discourage you from doing, that is considered
normal in Rails-style apps, is to have one action modify or query lots of
documents or views. This is good for latency and cacheability, but may
make some applications harder to build.

> Put another way: document *storage/replication* and application *processing*
> of documents can (and in many cases probably should) remain separate.
> Couchapps remain an interesting idea however.

Yes. It really depends on your development model. One thing Couchapps
have that is hard to get any other way is extreme portability.


>
> This means that the application layer has to fetch the named document,
> *then* analyse what type it has, before dispatching to the appropriate
> controller/view logic.

The way _show and _list handle this is by having named functions, so
you'd have a url like

/db/_show/myapp/posts/post-id

or for authors

/db/_show/myapp/authors/author-id

Of course, calling something like /db/_show/myapp/authors/post-id is
asking for an error, a problem that Rails avoids due to the
Class / Table mapping.


>
> (4) Error handling: for Javascript user interfaces, if they are going to
> rely on the JSON response for failed 'model' validation, then the JSON error
> format needs to be carefully described and sufficiently detailed that the
> front-end can display meaningful messages to the user, e.g. which particular
> fields are invalid and why.

Agreed - I'm working on some validation helpers for Couchapp and my
Sofa blog. Part of what's up in the air here is where to draw the line
between the "standard library" of helpers included with CouchDB, and
what should be maintained in its own project.

> Map-reduce functions are also going to have to be robust to data of the
> 'wrong' type being present.

I always check my fields before using them.

if (doc.foo && doc.foo.bar ... etc)

Good questions. I hope some of what I've written helps.


-- 
Chris Anderson
http://jchris.mfdz.com

Re: Database Disruption From The Couch

Posted by Brian Candler <B....@pobox.com>.
On Fri, Feb 06, 2009 at 10:10:29AM +0100, Andrea Schiavini wrote:
> Daniel, I've read your post, and it is definitely interesting. However I
> don't agree with you, when you say that CouchDB makes Rails obsolete. Rails
> is not only ActiveRecord, it's a model-view-controller framework that helps
> hugely during the development of big applications. The target of Rails is
> not to merely interact with the db, but to provide a good framework to
> handle a complex application that *also* interacts with a DB.

I have been working through some ideas related to this over the weekend and
playing with some code. I have a few points which I'd like to raise for
discussion.

(1) Personally, I'm not yet ready to move all view generation into the
browser (Futon-like), i.e. where Javascript fetches the JSON, reformats as
HTML, and submits back as JSON. In any case, supporting browsers without
Javascript is still a useful capability.

So for now at least, I still need a layer which will build HTML from a
document, and allow document create/update via form POST.

(2) I need an application logic layer, which enforces business rules.

Whilst couchdb may be gaining some features which could be used for 1 and 2
(_show and _action?) I don't think that Couchdb is necessarily the right
place to put this logic. In particular, I don't want to write lots of
application logic in Javascript, especially without a decent testing
framework which is independent of any browser and of Java.

Put another way: document *storage/replication* and application *processing*
of documents can (and in many cases probably should) remain separate.
Couchapps remain an interesting idea however.

(3) I do like the idea of non-HTML clients being able to read and submit
native JSON. This gives an external API, and also a possible migration route
to moving more functionality into the browser and/or the couchdb server in
future. Potentially, many of these requests can be proxied straight through
to couchdb.

So I've been thinking about the impedance mismatch between Rails-style
applications and couchdb.

An important issue here is what the URL scheme should look like. Couchdb has
a single flat space for all documents; Rails and co. tend to break up
resources into /resourcetype/id (where 'resourcetype' really is a controller
class). For HTML interaction we need the usual bundle of

  Collection actions:  list, new, create
  Member actions:      show, edit, update, destroy

So if you treat couchdb as a flat pool of documents, you might end up with

   /876384763284718      having {"type":"Post"}      -- show post
   /129837823746876      having {"type":"Comment"}   -- show comment
   /_new?type=Post
   /_new?type=Comment

This means that the application layer has to fetch the named document,
*then* analyse what type it has, before dispatching to the appropriate
controller/view logic.

An alternative I have been exploring is to combine the class and the ID into
the doc_id, separated by space (e.g. "Post 1234", "Comment 9999"). These
could be either user-assigned meaningful IDs, or random uuids; it doesn't
really make much difference. So you might have:

  GET /Post%201234               -- show
  GET /Post%201234/_edit
  PUT /Post%201234
  DELETE /Post%201234
  GET /Post                      -- index collection
  GET /Post/_new
  POST /Post/_create

(Aside: this implies that attachment names cannot start with underscore, as
I'm using them for member actions like _edit. Also, I'd be interested to
know what URLs couchapps like sofa use for these actions)

Some useful side effects of this are:
- browsing in Futon becomes a lot more useful;
- belongs-to links can display useful information (comment belongs-to
  Post 1234) without necessarily having to retrieve the other document.
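
A minimal sketch of the "Post 1234" ID scheme described above. The helper names (make_doc_id, split_doc_id, member_path) are hypothetical and are not taken from miniblog.rb; the only library call is ERB::Util.url_encode from Ruby's standard library, which escapes the space as %20.

```ruby
require "erb"

# Build an id of the form "Post 1234" (class name, space, local id).
def make_doc_id(type, local_id)
  "#{type} #{local_id}"
end

# Recover the class and the local id without fetching the document.
def split_doc_id(doc_id)
  type, local_id = doc_id.split(" ", 2)
  [type, local_id]
end

# Path for a member action such as _edit; no action means "show".
def member_path(doc_id, action = nil)
  path = "/#{ERB::Util.url_encode(doc_id)}"
  action ? "#{path}/_#{action}" : path
end

id = make_doc_id("Post", 1234)
puts member_path(id)
puts member_path(id, "edit")
puts split_doc_id(id).inspect
```

Because the type can be read straight from the ID, the application can dispatch to the right controller before it fetches anything.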

Anyway, I have started writing some Ruby code to try out this pattern:

    http://pobox.com/~b.candler/software/couchdb/miniblog.rb

[WARNING: extremely rough code!! You also need to install
public/javascripts/jquery.js if you want to be able to add/remove tags]

This was an interesting learning experience. For example, I found that even
the 'bare metal' couchrest was too thick: it didn't let me pass on raw JSON.
So I just used sinatra and rest-client directly.

I found I missed certain helpers, such as Javascript ones. But I think it
shows that this sort of 'thin middleware' approach could be workable,
perhaps still using Rails for its richer view generation capabilities.

I didn't (yet) miss having true "model" classes, although I have not started
to work on doing server-side validation.

Some of the actions turned out to be extremely simple though: e.g.

  get '/Post' do
    # TODO: pagination
    @posts = Db.get_docs("_all_docs", :startkey=>"Post ", :endkey=>"Post!")
    haml :post_index
  end
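
The startkey/endkey pair works because "!" is the character immediately after the space in ASCII, so every ID beginning with "Post " falls between the two bounds. The sketch below shows the idea on made-up sample IDs, assuming IDs are ordered byte by byte the way Ruby compares strings; the collation CouchDB actually applies to _all_docs should be checked against its documentation.

```ruby
# Doc ids as they might appear in one flat database (made-up sample data).
ids = ["Comment 9999", "Post 1234", "Post 42", "Poster 7", "Postscript"]

# Same bounds as the route above: "Post " <= id <= "Post!".
posts = ids.sort.select { |id| id >= "Post " && id <= "Post!" }

puts posts.inspect
```

IDs such as "Poster 7" are excluded because their sixth character sorts after "!".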

(4) Error handling: for Javascript user interfaces, if they are going to
rely on the JSON response for failed 'model' validation, then the JSON error
format needs to be carefully described and sufficiently detailed that the
front-end can display meaningful messages to the user, e.g. which particular
fields are invalid and why. This would apply for Javascript front-ends
talking to Rails-like middleware, and also talking directly to couchdb
should the validation end up in there (as it seems is soon going to be
possible).

This is something that it would be helpful to standardise, as it may make it
easier to move the validation into couchdb later. That is, I'm quite happy
with the idea of couchdb providing the 'model' part of MVC which 'models'
normally provide in Rails.

(5) Without a schema and without migrations, apps are going to have to be
more robust against coming across member data of the 'wrong' type (e.g.
expecting post.tags to be an array, and finding a string). In an ORM,
perhaps the expected type could be declared, and the ORM could then either
ignore the value or cast it automatically if the data is of the wrong type.

Map-reduce functions are also going to have to be robust to data of the
'wrong' type being present.

I discovered this when I found my post/tags structure was setting
"tags":"foo" instead of "tags":["foo"] when a single tag was entered in the
web page - this was an issue with sinatra, see
http://groups.google.com/group/sinatrarb/browse_thread/thread/4790a956bb3242eb
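
One defensive option on the Ruby side is to normalise the field before using or storing it. A minimal sketch, assuming tags should always end up as an array of non-empty strings; normalize_tags is a hypothetical helper, and it relies on Kernel#Array, which wraps a single string, passes arrays through, and turns nil into an empty array.

```ruby
# Normalise a tags value that may be nil, a single string, or an array.
def normalize_tags(value)
  Array(value).map(&:to_s).map(&:strip).reject(&:empty?)
end

puts normalize_tags("foo").inspect
puts normalize_tags(["foo", " bar "]).inspect
puts normalize_tags(nil).inspect
```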

(6) I want to "think" in couchdb and to write my application in terms of
couchdb native structures, like map/reduce views which I have hand-designed
to work how I want, rather than use an abstraction layer which forces its
own particular way of working.

The abstraction layers I have looked at so far don't allow me to work how I
want - such as embedding the class and ID into the doc_id as described
above, or handling has-many collections using specially collated views where
the parent is immediately followed by all its children.
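
To illustrate what "parent immediately followed by all its children" means, here is an in-memory sketch on made-up rows. The key layout ([post id, 0] for a post, [post id, 1, n] for its nth comment) is an assumption chosen for the example, and Ruby's array ordering stands in for the view's collation.

```ruby
# Made-up rows shaped like view output; the emit order is deliberately jumbled.
rows = [
  { "key" => ["Post 2", 1, 1], "value" => "comment c" },
  { "key" => ["Post 1", 0],    "value" => "first post" },
  { "key" => ["Post 1", 1, 2], "value" => "comment b" },
  { "key" => ["Post 2", 0],    "value" => "second post" },
  { "key" => ["Post 1", 1, 1], "value" => "comment a" }
]

# Sorting by key puts each post directly ahead of its own comments.
sorted = rows.sort_by { |row| row["key"] }

# Walk the sorted rows once, attaching each comment to the preceding post.
posts = []
sorted.each do |row|
  _post_id, kind = row["key"]
  if kind.zero?
    posts << { "title" => row["value"], "comments" => [] }
  else
    posts.last["comments"] << row["value"]
  end
end

puts posts.inspect
```

A single pass over one range of rows is enough to rebuild the post together with its comments, which is the attraction of the pattern.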

Oh well, that's just a bit of a brain dump. Maybe looking through the code I
linked to above will make more sense of it, at least for Ruby people.

Regards,

Brian.

Re: Database Disruption From The Couch

Posted by Andrea Schiavini <a....@sourcesense.com>.
Daniel, I've read your post, and it is definitely interesting. However I
don't agree with you, when you say that CouchDB makes Rails obsolete. Rails
is not only ActiveRecord, it's a model-view-controller framework that helps
hugely during the development of big applications. The target of Rails is
not to merely interact with the db, but to provide a good framework to
handle a complex application that *also* interacts with a DB. The logical
separation into model, views and controllers is vital. Also, Rails
simplifies the handling of routes, environments and so on. I've been
developing an application over the last few months, built on Rails but using
CouchDB. Even though I excluded ActiveModel, ActiveRecord and so on, Rails
has been really helpful to me.

Andrea Schiavini

2009/2/5 Daniel T <da...@gmail.com>

> This is a blog entry I wrote about CouchDB about a month ago... been
> meaning
> to post it here
>
> http://danieltsadok.wordpress.com/2008/12/22/database-disruption-from-the-couch/
>
> :-Daniel Tsadok
>