Posted to user@couchdb.apache.org by Yuriy Sazonets <yu...@sazonets.com> on 2008/10/26 20:20:48 UTC

Error On Getting All Documents From The Databases

Hi,

Not sure if this is a bug or I'm doing something wrong.

I have two databases for testing called 'dev_offer' and 'dev_script'.
I inserted a few (1 or 2) documents into each of them, and after a
system crash I can't retrieve the _all view for either database:

>curl http://127.0.0.1:5984/dev_script/_all/
{"error":"EXIT","reason":"{{{badmatch,eof},\n
[{couch_file,handle_call,3},\n   {gen_server,handle_msg,5},\n
{proc_lib,init_p,5}]},\n
{gen_server,call,[<0.58.0>,{pread_bin,14207},infinity]}}"}

Log:

[Sun, 26 Oct 2008 19:17:13 GMT] [error] [<0.62.0>] ** Generic server
<0.62.0> terminating
** Last message in was {pread_bin,14207}
** When Server state == {file_descriptor,prim_file,{#Port<0.142>,11}}
** Reason for termination ==
** {{badmatch,eof},
    [{couch_file,handle_call,3},
     {gen_server,handle_msg,5},
     {proc_lib,init_p,5}]}


[Sun, 26 Oct 2008 19:17:13 GMT] [error] [<0.62.0>] {error_report,<0.22.0>,
    {<0.62.0>,crash_report,
     [[{pid,<0.62.0>},
       {registered_name,[]},
       {error_info,
           {exit,
               {{badmatch,eof},
                [{couch_file,handle_call,3},
                 {gen_server,handle_msg,5},
                 {proc_lib,init_p,5}]},
               [{gen_server,terminate,6},{proc_lib,init_p,5}]}},
       {initial_call,
           {gen,init_it,
               [gen_server,<0.47.0>,<0.47.0>,couch_file,
                {"/usr/local/var/lib/couchdb/dev_script.couch",[],<0.47.0>},
                []]}},
       {ancestors,[couch_server_sup,<0.1.0>]},
       {messages,[]},
       {links,[<0.64.0>]},
       {dictionary,[]},
       {trap_exit,false},
       {status,running},
       {heap_size,377},
       {stack_size,23},
       {reductions,1233}],
      [{neighbour,
           [{pid,<0.65.0>},
            {registered_name,[]},
            {initial_call,
                {gen,init_it,
                    [gen_server,<0.64.0>,<0.64.0>,couch_stream,
                     {{4165,9931},<0.62.0>},
                     []]}},
            {current_function,{gen_server,loop,6}},
            {ancestors,[<0.64.0>]},
            {messages,[]},
            {links,[<0.64.0>]},
            {dictionary,[]},
            {trap_exit,false},
            {status,waiting},
            {heap_size,233},
            {stack_size,12},
            {reductions,37}]},
       {neighbour,
           [{pid,<0.63.0>},
            {registered_name,[]},
            {initial_call,
                {gen,init_it,
                    [gen_server,<0.47.0>,<0.47.0>,couch_db,
                     {"dev_script",
                      "/usr/local/var/lib/couchdb/dev_script.couch",<0.62.0>,
                      []},
                     []]}},
            {current_function,{gen_server,loop,6}},
            {ancestors,[couch_server_sup,<0.1.0>]},
            {messages,[]},
            {links,[<0.47.0>,<0.64.0>]},
            {dictionary,[]},
            {trap_exit,false},
            {status,waiting},
            {heap_size,610},
            {stack_size,12},
            {reductions,93}]},
       {neighbour,
           [{pid,<0.64.0>},
            {registered_name,[]},
            {initial_call,{couch_db,start_update_loop,2}},
            {current_function,{couch_db,update_loop,1}},
            {ancestors,[]},
            {messages,[]},
            {links,[<0.63.0>,<0.65.0>,<0.62.0>]},
            {dictionary,[]},
            {trap_exit,false},
            {status,waiting},
            {heap_size,233},
            {stack_size,10},
            {reductions,106}]}]]}}

[Sun, 26 Oct 2008 19:17:13 GMT] [error] [<0.47.0>] {error_report,<0.22.0>,
    {<0.47.0>,supervisor_report,
     [{supervisor,{local,couch_server_sup}},
      {errorContext,child_terminated},
      {reason,
          {{badmatch,eof},
           [{couch_file,handle_call,3},
            {gen_server,handle_msg,5},
            {proc_lib,init_p,5}]}},
      {offender,
          [{pid,<0.63.0>},
           {name,"dev_script"},
           {mfa,
               {couch_db,open,
                   ["dev_script",
                    "/usr/local/var/lib/couchdb/dev_script.couch"]}},
           {restart_type,transient},
           {shutdown,infinity},
           {child_type,supervisor}]}]}}

[Sun, 26 Oct 2008 19:17:13 GMT] [info] [<0.3347.0>] HTTP Error (code
500): {'EXIT',
                           {{{badmatch,eof},
                             [{couch_file,handle_call,3},
                              {gen_server,handle_msg,5},
                              {proc_lib,init_p,5}]},
                            {gen_server,call,
                                [<0.62.0>,{pread_bin,14207},infinity]}}}

[Sun, 26 Oct 2008 19:17:13 GMT] [info] [<0.3347.0>] 127.0.0.1 - - "GET
/dev_script/_all/" 500

>curl http://127.0.0.1:5984/dev_script/
{"db_name":"dev_script","doc_count":1,"doc_del_count":0,"update_seq":1,"compact_running":false,"disk_size":8192}

>curl http://127.0.0.1:5984/
{"couchdb":"Welcome","version":"0.8.1-incubating"}

All tests from the test suite succeed. Running CouchDB 0.8.1 on Mac OS
X 10.5.5 built from the official source package.

-- 
With Respect,
Yuriy.

Re: Error On Getting All Documents From The Databases

Posted by Yuriy Sazonets <yu...@sazonets.com>.
Hi,

On Mon, Oct 27, 2008 at 11:50 AM, Jan Lehnardt <ja...@apache.org> wrote:

> Weird. Was `dev_script` created with an older version of CouchDB? If so,
> you need to recreate it.

No, same version. These are actually the first documents I tried to insert
into a fresh CouchDB installation :(

-- 
With Respect,
Yuriy.

Re: Efficient view design question

Posted by Ben Nevile <be...@mainsocial.com>.
Hi Chris - thanks for the update=true.  ;)
For highly-interdependent data sets this feature is really important.  I'll
follow your progress closely, and be happy to test out on my real data
(hundreds of thousands of records, millions of writes a day.)

Ben


On Mon, Oct 27, 2008 at 1:36 PM, Chris Anderson <jc...@apache.org> wrote:

> On Mon, Oct 27, 2008 at 10:51 AM, Ben Nevile <be...@mainsocial.com> wrote:
> > Thanks Jason.  One further clarification, re "If you set the *update*
> > option to *false*, CouchDB will not perform any refreshing on the view
> > that may be necessary."
> > Does this mean that if I only call the view with update=false, the view
> > index will never be updated?
> >
> > Ben
>
> The update=false behavior is complicated by some implementation
> details. Oh, and the feature is currently not available in trunk, but
> is under active development.
>
> The upshot is that update=false will be useful for reducing latency in
> queries against a database that has a lot of writes going into it.
> However, it's not a guarantee that no updating will be triggered,
> nor does it guarantee that results will be returned immediately. For
> that reason, the name might be misleading, but let's wait until the
> implementation is done to reconsider the name.
>
> Implementation details: update=false queries the latest available
> version of a completed view index. This means that if the view has not
> been built yet, there will not be an available completed index, and
> the index will be built before running the query. Also, in the case of
> a freshly booted server, there may not be a pointer to the last
> completed index, so we end up running an update then as well. For
> various reasons, partially completed indexes are not viable for
> queries.
>
> There are a lot of people who'd like a progress-bar on view
> computation. It is possible that we'll make available something like
> update=status, which could tell users roughly how many docs have been
> mapped, vs how many docs there are total, which would give an
> approximate measure of progress.
>
> Chris
>
>
>
> --
> Chris Anderson
> http://jchris.mfdz.com
>

Re: Efficient view design question

Posted by Chris Anderson <jc...@apache.org>.
On Mon, Oct 27, 2008 at 10:51 AM, Ben Nevile <be...@mainsocial.com> wrote:
> Thanks Jason.  One further clarification, re "If you set the *update* option
> to *false*, CouchDB will not perform any refreshing on the view that may be
> necessary."
> Does this mean that if I only call the view with update=false, the view
> index will never be updated?
>
> Ben

The update=false behavior is complicated by some implementation
details. Oh, and the feature is currently not available in trunk, but
is under active development.

The upshot is that update=false will be useful for reducing latency in
queries against a database that has a lot of writes going into it.
However, it's not a guarantee that no updating will be triggered,
nor does it guarantee that results will be returned immediately. For
that reason, the name might be misleading, but let's wait until the
implementation is done to reconsider the name.

Implementation details: update=false queries the latest available
version of a completed view index. This means that if the view has not
been built yet, there will not be an available completed index, and
the index will be built before running the query. Also, in the case of
a freshly booted server, there may not be a pointer to the last
completed index, so we end up running an update then as well. For
various reasons, partially completed indexes are not viable for
queries.
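
To make the cases above concrete, here is a toy JavaScript model of the
decision as described in this message. It is only a sketch: the function
name planQuery, the completedSeq field and the sequence numbers are all
invented for illustration, and none of this is CouchDB code (the feature
was still under development at the time of writing).

```javascript
// Toy model of the behaviour described above (not CouchDB code).
// view.completedSeq: sequence of the last completed index, or null if
//                    there is none (never built, or pointer lost on reboot).
// dbSeq:             current update sequence of the database.
// update:            false models a request made with update=false.
function planQuery(view, dbSeq, update) {
  if (view.completedSeq === null) {
    // No completed index is available, so the index has to be built
    // before the query runs, even when update=false was requested.
    return { runUpdate: true, servedSeq: dbSeq };
  }
  if (update === false) {
    // Serve the latest completed index, which may be stale.
    return { runUpdate: false, servedSeq: view.completedSeq };
  }
  // Default: bring the index up to date before answering.
  return { runUpdate: view.completedSeq < dbSeq, servedSeq: dbSeq };
}

const stale = planQuery({ completedSeq: 90 }, 100, false);
const coldStart = planQuery({ completedSeq: null }, 100, false);
const fresh = planQuery({ completedSeq: 90 }, 100, true);
console.log(stale, coldStart, fresh);
```

The second call shows why update=false is not a guarantee: with no
completed index to fall back on, the model still has to run an update.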

There are a lot of people who'd like a progress-bar on view
computation. It is possible that we'll make available something like
update=status, which could tell users roughly how many docs have been
mapped, vs how many docs there are total, which would give an
approximate measure of progress.

Chris



-- 
Chris Anderson
http://jchris.mfdz.com

Re: Efficient view design question

Posted by Ben Nevile <be...@mainsocial.com>.
Thanks Jason.  One further clarification, re "If you set the *update* option
to *false*, CouchDB will not perform any refreshing on the view that may be
necessary."
Does this mean that if I only call the view with update=false, the view
index will never be updated?

Ben




On Mon, Oct 27, 2008 at 10:13 AM, Jason Davies <ja...@jasondavies.com> wrote:

> Jason Davies wrote:
>
>> Ben Nevile wrote:
>>
>>> Thanks Jan, that article answers a lot of questions.
>>> re the following quote: "There is also an option (under development) to
>>> immediately return a stale copy of the view in case the client can
>>> tolerate that."
>>>
>>> My application needs to do hundreds of thousands of index updates for
>>> every incoming piece of data, so being able to return a stale result is
>>> critical. Where can I find some documentation about this option?
>>>
>>>
>>
>> http://wiki.apache.org/couchdb/HttpViewApi#head-45806c14f9b1a9e1a4b0ea579ffdf150077f8cb9
>>
> Sorry, I forgot to point out which option it is!  It's the update=false
> option.
>
> HTH,
>
>
> Jason
>
> --
> Jason Davies
>
> www.jasondavies.com
>
>

Re: Efficient view design question

Posted by Jason Davies <ja...@jasondavies.com>.
Jason Davies wrote:
> Ben Nevile wrote:
>> Thanks Jan, that article answers a lot of questions.
>> re the following quote: "There is also an option (under development) to
>> immediately return a stale copy of the view in case the client can
>> tolerate that."
>>
>> My application needs to do hundreds of thousands of index updates for
>> every incoming piece of data, so being able to return a stale result is
>> critical. Where can I find some documentation about this option?
>>   
> http://wiki.apache.org/couchdb/HttpViewApi#head-45806c14f9b1a9e1a4b0ea579ffdf150077f8cb9 
>
Sorry, I forgot to point out which option it is!  It's the update=false 
option.

HTH,

Jason

-- 
Jason Davies

www.jasondavies.com


Re: Efficient view design question

Posted by Jason Davies <ja...@jasondavies.com>.
Ben Nevile wrote:
> Thanks Jan, that article answers a lot of questions.
> re the following quote: "There is also an option (under development) to
> immediately return a stale copy of the view in case the client can tolerate
> that."
>
> My application needs to do hundreds of thousands of index updates for every
> incoming piece of data, so being able to return a stale result is
> critical. Where can I find some documentation about this option?
>   
http://wiki.apache.org/couchdb/HttpViewApi#head-45806c14f9b1a9e1a4b0ea579ffdf150077f8cb9

Jason

-- 
Jason Davies

www.jasondavies.com


Re: Efficient view design question

Posted by Ben Nevile <be...@mainsocial.com>.
Thanks Jan, that article answers a lot of questions.
re the following quote: "There is also an option (under development) to
immediately return a stale copy of the view in case the client can tolerate
that."

My application needs to do hundreds of thousands of index updates for every
incoming piece of data, so being able to return a stale result is
critical. Where can I find some documentation about this option?

Ben



On Mon, Oct 27, 2008 at 9:30 AM, Jan Lehnardt <ja...@apache.org> wrote:

>
> On Oct 27, 2008, at 17:15, Dean Landolt wrote:
>
>> On Mon, Oct 27, 2008 at 12:13 PM, Jan Lehnardt <ja...@apache.org> wrote:
>>
>>> On Oct 27, 2008, at 16:53, Ben Nevile wrote:
>>>
>>>>> First off, to allay your main concern: view indexes are not completely
>>>>> regenerated on each update. It's only a diff.
>>>>
>>>> Presumably reduce operations have to operate on the entire set every
>>>> time?
>>>
>>> Nope, reduce results are stored in the btree back nodes.
>>
>> Or...that.
>>
>> Jan -- can you speak to what that means? Back nodes?
>>
>
> See http://horicky.blogspot.com/2008/10/couchdb-implementation.html
>
> Cheers
> Jan
> --
>

Re: Efficient view design question

Posted by Jan Lehnardt <ja...@apache.org>.
On Oct 27, 2008, at 17:15, Dean Landolt wrote:

> On Mon, Oct 27, 2008 at 12:13 PM, Jan Lehnardt <ja...@apache.org> wrote:
>
>>
>> On Oct 27, 2008, at 16:53, Ben Nevile wrote:
>>
>>
>>>> First off, to allay your main concern: view indexes are not
>>>> completely regenerated on each update. It's only a diff.
>>>>
>>> Presumably reduce operations have to operate on the entire set
>>> every time?
>>>
>>
>> Nope, reduce results are stored in the btree back nodes.
>
>
> Or...that.
>
> Jan -- can you speak to what that means? Back nodes?

See http://horicky.blogspot.com/2008/10/couchdb-implementation.html

Cheers
Jan
--

Re: Efficient view design question

Posted by Dean Landolt <de...@deanlandolt.com>.
On Mon, Oct 27, 2008 at 12:13 PM, Jan Lehnardt <ja...@apache.org> wrote:

>
> On Oct 27, 2008, at 16:53, Ben Nevile wrote:
>
>
>>> First off, to allay your main concern: view indexes are not completely
>>> regenerated on each update. It's only a diff.
>>>
>>>
>> Presumably reduce operations have to operate on the entire set every time?
>>
>
> Nope, reduce results are stored in the btree back nodes.


Or...that.

Jan -- can you speak to what that means? Back nodes?

Re: Efficient view design question

Posted by Jan Lehnardt <ja...@apache.org>.
On Oct 27, 2008, at 16:53, Ben Nevile wrote:

>>
>> First off, to allay your main concern: view indexes are not completely
>> regenerated on each update. It's only a diff.
>>
>
> Presumably reduce operations have to operate on the entire set every  
> time?

Nope, reduce results are stored in the btree back nodes.

Cheers
Jan
--

Re: Efficient view design question

Posted by Dean Landolt <de...@deanlandolt.com>.
On Mon, Oct 27, 2008 at 11:53 AM, Ben Nevile <be...@mainsocial.com> wrote:

> >
> > First off, to allay your main concern: view indexes are not completely
> > regenerated on each update. It's only a diff.
> >
>
> Presumably reduce operations have to operate on the entire set every time?
>
> Ben
>

I may be completely off base here, but that's where rereduce comes in. I did
a little profiling on reduce a while back, and it looks as though it lumps
reductions into manageable chunks of approximately 20 or so emitted map
records. I don't know what kind of magic it does behind the scenes to
determine when to tease apart these chunks, but something's going on where
only affected records get broken apart, otherwise the whole chunk gets
passed through as an object. So yes, a reduce operates on the whole set, but
in very large summary chunks.
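
As a rough illustration of the chunking idea, here is a toy JavaScript
sketch. The reduce function uses the (keys, values, rereduce) signature of
CouchDB reduce functions; everything else (the reduceInChunks driver, the
flat array of rows, the fixed chunk size of 20 taken from the rough figure
above) is an assumption made for the example and does not reflect how
CouchDB actually stores or splits its reductions.

```javascript
// CouchDB-style reduce function: sums the values. On rereduce, the
// values are partial sums produced by earlier reduce calls.
function sumReduce(keys, values, rereduce) {
  return values.reduce(function (acc, v) { return acc + v; }, 0);
}

// Toy driver (not CouchDB's btree): reduce fixed-size chunks of rows,
// then rereduce the per-chunk results into one total.
var CHUNK = 20; // assumption: echoes the "approximately 20" observed above
function reduceInChunks(rows, reduceFn) {
  var partials = [];
  for (var i = 0; i < rows.length; i += CHUNK) {
    var chunk = rows.slice(i, i + CHUNK);
    var keys = chunk.map(function (r) { return [r.key, r.id]; });
    var values = chunk.map(function (r) { return r.value; });
    partials.push(reduceFn(keys, values, false));
  }
  // The partial results are combined without revisiting the original rows.
  return { partials: partials, total: reduceFn(null, partials, true) };
}

var rows = [];
for (var n = 1; n <= 50; n++) {
  rows.push({ id: "doc" + n, key: n, value: 1 });
}
var result = reduceInChunks(rows, sumReduce);
console.log(result);
```

In this model, a change to one row would only require re-reducing the
chunk that contains it, followed by a cheap rereduce over the partials.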

Someone correct me if I'm wrong (please!) -- I've been scratching my head
about the internals of reduce for a while now, and still haven't found a
very illuminating description of the process.

Re: Efficient view design question

Posted by Ben Nevile <be...@mainsocial.com>.
>
> First off, to allay your main concern: view indexes are not completely
> regenerated on each update. It's only a diff.
>

Presumably reduce operations have to operate on the entire set every time?

Ben

Re: Efficient view design question

Posted by Paul Davis <pa...@gmail.com>.
Jonathan,

That's there too. Same patch, even. You can POST an array of keys to
any defined or temporary view, as well as to _all_docs. Not sure if it's
in the wiki yet.

Note: The post body should include something like

{"keys": ["key1", "key2"]}

And if you're hitting _all_docs, the keys (key1, ...) would be document IDs.
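
For illustration, a minimal JavaScript sketch of how a client might
assemble such a request. The helper name buildKeysRequest, the example
document IDs and the Content-Type header are assumptions made for this
example; only the {"keys": [...]} body shape and the _all_docs endpoint
come from the message above. The sketch builds the request description
without sending it.

```javascript
// Hypothetical helper (not part of CouchDB): describes a multi-key
// request against _all_docs or a view, without sending anything.
function buildKeysRequest(dbUrl, viewPath, keys) {
  if (!Array.isArray(keys) || keys.length === 0) {
    throw new Error("keys must be a non-empty array");
  }
  return {
    method: "POST",
    // Strip trailing slashes so the path is joined with exactly one.
    url: dbUrl.replace(/\/+$/, "") + "/" + viewPath,
    // Assumed header; the thread does not mention one.
    headers: { "Content-Type": "application/json" },
    // Body shape as quoted above: {"keys": [...]}.
    body: JSON.stringify({ keys: keys }),
  };
}

// For _all_docs the keys are document IDs (these IDs are invented).
const req = buildKeysRequest(
  "http://127.0.0.1:5984/dev_script/",
  "_all_docs",
  ["doc1", "doc2"]
);
console.log(req.method, req.url, req.body);
```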

Paul

On Mon, Oct 27, 2008 at 9:08 AM, Jonathan Moss
<jo...@tangentlabs.co.uk> wrote:
> Paul,
>
> That makes sense :)
>
> As for using the include_docs parameter, that is certainly one option. I
> also believe I saw something mentioned a while ago about being able to
> retrieve multiple docs in a single GET request by providing a series of
> IDs. Was this just under discussion, or does it already exist? I figure
> that if I already have the IDs, I do not need a view for this.
>
> Thanks,
>
> Jon
>>
>> Jonathan,
>>
>> First off, to allay your main concern: view indexes are not completely
>> regenerated on each update. It's only a diff.
>>
>> So, suppose we have a database with some built view. If a document X
>> changes in the db, the view server deletes any rows in the view that
>> came from doc X, then runs the map function on the new version of the
>> doc and adds back the resulting rows.
>>
>> With this method, each time you request a view, it only updates the
>> data that has changed since the last view request.
>>
>> Other than that, as you point out, emitting the entire doc isn't
>> overly efficient. One thing to consider is the relatively recent
>> addition of the include_docs parameter. Also, there's a wiki page on
>> working with hierarchical data that's got some good ideas.
>>
>> HTH,
>> Paul Davis
>>
>> On Mon, Oct 27, 2008 at 7:20 AM, Jonathan Moss
>> <jo...@tangentlabs.co.uk> wrote:
>>
>>>
>>> Greetings all,
>>>
>>> I am currently writing a set of classes to handle PHP object model <->
>>> CouchDB mapping. The PHP objects are hierarchical, and I have modelled
>>> this as essentially a doubly linked list, so that every document within
>>> CouchDB has a 'Children' array and a 'Parents' array. These arrays
>>> contain the IDs of related objects.
>>>
>>> I already have a couple of map functions to retrieve children and
>>> parents:
>>>
>>> "childrenOf": {
>>>     "map": "function(doc) { for (var idx in doc.Parents) { emit(doc.Parents[idx], doc); } }"
>>> },
>>> "parentsOf": {
>>>     "map": "function(doc) { for (var idx in doc.Children) { emit(doc.Children[idx], doc); } }"
>>> }
>>>
>>> These functions return whole documents. My understanding of views is
>>> that they have to be regenerated every time a document is added,
>>> removed or updated. If that is the case, then as the number of
>>> documents in the database grows, the initial response time to retrieve
>>> one of these views would become considerable. In a small system where
>>> writes are uncommon and reads are regular, this would not be an issue.
>>> However, I am struggling to find more than a handful of niche
>>> applications where this would be true. In almost all web applications
>>> I have written, almost every request to the website results in
>>> something (even if it is just tracking data) being written to the
>>> database. On a high-volume website this would result in views having
>>> to be re-created almost constantly. Therefore efficient view design
>>> becomes paramount.
>>>
>>> The view functions shown above return the whole doc, which I know is
>>> inefficient. In fact, since I already have the document whose
>>> children/parents I want, I also already have all the child/parent IDs.
>>> Would it be much more efficient to simply retrieve the parent/child
>>> documents individually rather than having to regenerate views all the
>>> time?
>>>
>>> As a side question: having to regenerate views constantly in this kind
>>> of situation could prove a real issue. I know that CouchDB is still a
>>> pre-1.0 release and the developers are necessarily focusing on 'getting
>>> it right' before 'getting it fast' (to coin a phrase :) but will the
>>> speed improvements already on the roadmap make these worries moot
>>> except in very large databases, or is it always going to be an issue
>>> and therefore require some clever application design, e.g. keeping
>>> frequently updated data in a traditional SQL DB and only keeping rarely
>>> updated data in CouchDB? That would be a shame.
>>>
>>> Thanks,
>>> Jon
>>>
>>>
>>
>>
>>
>
>

Re: Efficient view design question

Posted by Jonathan Moss <jo...@tangentlabs.co.uk>.
Paul,

That makes sense :)

As for using the include_docs parameter, that is certainly one option. I
also believe I saw something mentioned a while ago about being able to
retrieve multiple docs in a single GET request by providing a series of
IDs. Was this just under discussion, or does it already exist? I figure
that if I already have the IDs, I do not need a view for this.

Thanks,

Jon
> Jonathan,
>
> First off, to allay your main concern: view indexes are not completely
> regenerated on each update. It's only a diff.
>
> So, suppose we have a database with some built view. If a document X
> changes in the db, the view server deletes any rows in the view that
> came from doc X, then runs the map function on the new version of the
> doc and adds back the resulting rows.
>
> With this method, each time you request a view, it only updates the
> data that has changed since the last view request.
>
> Other than that, as you point out, emitting the entire doc isn't
> overly efficient. One thing to consider is the relatively recent
> addition of the include_docs parameter. Also, there's a wiki page on
> working with hierarchical data that's got some good ideas.
>
> HTH,
> Paul Davis
>
> On Mon, Oct 27, 2008 at 7:20 AM, Jonathan Moss
> <jo...@tangentlabs.co.uk> wrote:
>   
>> Greetings all,
>>
>> I am currently writing a set of classes to handle PHP object model <->
>> CouchDB mapping. The PHP objects are hierarchical, and I have modelled
>> this as essentially a doubly linked list, so that every document within
>> CouchDB has a 'Children' array and a 'Parents' array. These arrays
>> contain the IDs of related objects.
>>
>> I already have a couple of map functions to retrieve children and parents:
>>
>> "childrenOf": {
>>     "map": "function(doc) { for (var idx in doc.Parents) { emit(doc.Parents[idx], doc); } }"
>> },
>> "parentsOf": {
>>     "map": "function(doc) { for (var idx in doc.Children) { emit(doc.Children[idx], doc); } }"
>> }
>>
>> These functions return whole documents. My understanding of views is
>> that they have to be regenerated every time a document is added,
>> removed or updated. If that is the case, then as the number of
>> documents in the database grows, the initial response time to retrieve
>> one of these views would become considerable. In a small system where
>> writes are uncommon and reads are regular, this would not be an issue.
>> However, I am struggling to find more than a handful of niche
>> applications where this would be true. In almost all web applications
>> I have written, almost every request to the website results in
>> something (even if it is just tracking data) being written to the
>> database. On a high-volume website this would result in views having
>> to be re-created almost constantly. Therefore efficient view design
>> becomes paramount.
>>
>> The view functions shown above return the whole doc, which I know is
>> inefficient. In fact, since I already have the document whose
>> children/parents I want, I also already have all the child/parent IDs.
>> Would it be much more efficient to simply retrieve the parent/child
>> documents individually rather than having to regenerate views all the
>> time?
>>
>> As a side question: having to regenerate views constantly in this kind
>> of situation could prove a real issue. I know that CouchDB is still a
>> pre-1.0 release and the developers are necessarily focusing on 'getting
>> it right' before 'getting it fast' (to coin a phrase :) but will the
>> speed improvements already on the roadmap make these worries moot
>> except in very large databases, or is it always going to be an issue
>> and therefore require some clever application design, e.g. keeping
>> frequently updated data in a traditional SQL DB and only keeping rarely
>> updated data in CouchDB? That would be a shame.
>>
>> Thanks,
>> Jon
>>
>>     
>
>
>   


Re: Efficient view design question

Posted by Paul Davis <pa...@gmail.com>.
Jonathan,

First off, to allay your main concern: view indexes are not completely
regenerated on each update. It's only a diff.

So, suppose we have a database with some built view. If a document X
changes in the db, the view server deletes any rows in the view that
came from doc X, then runs the map function on the new version of the
doc and adds back the resulting rows.

With this method, each time you request a view, it only updates the
data that has changed since the last view request.
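
The delete-then-remap step can be sketched in JavaScript as a toy model.
This is not CouchDB code: the updateView function and the flat, unsorted
array of rows are assumptions made for the example (a real view index is
kept sorted by key), and emit is passed in as a parameter here rather
than being a global as it is inside the view server.

```javascript
// Toy model of an incrementally maintained view (not CouchDB code).
// rows: array of { id, key, value }; mapFn(doc, emit) is a map function.
function updateView(rows, changedDoc, mapFn) {
  // 1. Drop every row that came from the changed document.
  const kept = rows.filter((row) => row.id !== changedDoc._id);
  // 2. Re-run the map function on the new version and add its rows back.
  mapFn(changedDoc, (key, value) => {
    kept.push({ id: changedDoc._id, key: key, value: value });
  });
  return kept;
}

// Map function modelled on the "childrenOf" view from the question.
function childrenOf(doc, emit) {
  for (const parentId of doc.Parents || []) {
    emit(parentId, null);
  }
}

let rows = [];
rows = updateView(rows, { _id: "a", Parents: ["root"] }, childrenOf);
rows = updateView(rows, { _id: "b", Parents: ["root"] }, childrenOf);
// Document "a" moves under "b": only the rows from "a" are replaced.
rows = updateView(rows, { _id: "a", Parents: ["b"] }, childrenOf);
console.log(rows);
```

The rows belonging to document "b" are never recomputed in the last step,
which is the point of the diff-based update.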

Other than that, as you point out, emitting the entire doc isn't
overly efficient. One thing to consider is the relatively recent
addition of the include_docs parameter. Also, there's a wiki page on
working with hierarchical data that's got some good ideas.
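
One way this advice might be applied to the childrenOf view from the
question is to emit null instead of the whole document and ask for the
documents at query time with include_docs=true. The sketch below is an
assumption about how that could look: the name childrenOfLean and the
sample document are invented, and the emit stub only exists so the
function can run outside the view server.

```javascript
// Lean variant of the "childrenOf" map function: emit only the key and
// let include_docs=true attach the document at query time.
function childrenOfLean(doc) {
  for (var idx in doc.Parents) {
    emit(doc.Parents[idx], null);
  }
}

// Stand-in for the view server's emit(), so the sketch runs on its own.
var emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

childrenOfLean({ _id: "child1", Parents: ["p1", "p2"] });
console.log(emitted);

// Query-string fragment asking for the children of "p1" with their docs.
// View keys are JSON-encoded, hence the JSON.stringify.
var query = "?key=" + encodeURIComponent(JSON.stringify("p1")) + "&include_docs=true";
console.log(query);
```

The index then stores one small row per parent link instead of a full
copy of each document per link.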

HTH,
Paul Davis

On Mon, Oct 27, 2008 at 7:20 AM, Jonathan Moss
<jo...@tangentlabs.co.uk> wrote:
> Greetings all,
>
> I am currently writing a set of classes to handle PHP object model <->
> CouchDB mapping. The PHP objects are hierarchical, and I have modelled
> this as essentially a doubly linked list, so that every document within
> CouchDB has a 'Children' array and a 'Parents' array. These arrays
> contain the IDs of related objects.
>
> I already have a couple of map functions to retrieve children and parents:
>
> "childrenOf": {
>     "map": "function(doc) { for (var idx in doc.Parents) { emit(doc.Parents[idx], doc); } }"
> },
> "parentsOf": {
>     "map": "function(doc) { for (var idx in doc.Children) { emit(doc.Children[idx], doc); } }"
> }
>
> These functions return whole documents. My understanding of views is
> that they have to be regenerated every time a document is added,
> removed or updated. If that is the case, then as the number of
> documents in the database grows, the initial response time to retrieve
> one of these views would become considerable. In a small system where
> writes are uncommon and reads are regular, this would not be an issue.
> However, I am struggling to find more than a handful of niche
> applications where this would be true. In almost all web applications
> I have written, almost every request to the website results in
> something (even if it is just tracking data) being written to the
> database. On a high-volume website this would result in views having
> to be re-created almost constantly. Therefore efficient view design
> becomes paramount.
>
> The view functions shown above return the whole doc, which I know is
> inefficient. In fact, since I already have the document whose
> children/parents I want, I also already have all the child/parent IDs.
> Would it be much more efficient to simply retrieve the parent/child
> documents individually rather than having to regenerate views all the
> time?
>
> As a side question: having to regenerate views constantly in this kind
> of situation could prove a real issue. I know that CouchDB is still a
> pre-1.0 release and the developers are necessarily focusing on 'getting
> it right' before 'getting it fast' (to coin a phrase :) but will the
> speed improvements already on the roadmap make these worries moot
> except in very large databases, or is it always going to be an issue
> and therefore require some clever application design, e.g. keeping
> frequently updated data in a traditional SQL DB and only keeping rarely
> updated data in CouchDB? That would be a shame.
>
> Thanks,
> Jon
>

Re: Efficient view design question

Posted by Jan Lehnardt <ja...@apache.org>.
On Oct 27, 2008, at 12:20, Jonathan Moss wrote:

> The view functions shown above return the whole doc, which I know is
> inefficient. In fact, since I already have the document whose
> children/parents I want, I also already have all the child/parent IDs.
> Would it be much more efficient to simply retrieve the parent/child
> documents individually rather than having to regenerate views all the
> time?

See 2) at http://wiki.apache.org/couchdb/FrequentlyAskedQuestions#slow_view_building


> As a side question: having to regenerate views constantly in this kind
> of situation could prove a real issue. I know that CouchDB is still a
> pre-1.0 release and the developers are necessarily focusing on 'getting
> it right' before 'getting it fast' (to coin a phrase :) but will the
> speed improvements already on the roadmap make these worries moot
> except in very large databases, or is it always going to be an issue
> and therefore require some clever application design, e.g. keeping
> frequently updated data in a traditional SQL DB and only keeping rarely
> updated data in CouchDB? That would be a shame.

We'll work on improving view creation time. At the moment it is all  
done sequentially. The MapReduce design will allow us to parallelize  
operations which will make better use of your multi-core servers.

Cheers
Jan
--

Efficient view design question

Posted by Jonathan Moss <jo...@tangentlabs.co.uk>.
Greetings all,

I am currently writing a set of classes to handle PHP object model <->
CouchDB mapping. The PHP objects are hierarchical, and I have modelled
this as essentially a doubly linked list, so that every document within
CouchDB has a 'Children' array and a 'Parents' array. These arrays
contain the IDs of related objects.

I already have a couple of map functions to retrieve children and parents:

"childrenOf": {
    "map": "function(doc) { for (var idx in doc.Parents) { emit(doc.Parents[idx], doc); } }"
},
"parentsOf": {
    "map": "function(doc) { for (var idx in doc.Children) { emit(doc.Children[idx], doc); } }"
}

These functions return whole documents. My understanding of views is
that they have to be regenerated every time a document is added,
removed or updated. If that is the case, then as the number of
documents in the database grows, the initial response time to retrieve
one of these views would become considerable. In a small system where
writes are uncommon and reads are regular, this would not be an issue.
However, I am struggling to find more than a handful of niche
applications where this would be true. In almost all web applications
I have written, almost every request to the website results in
something (even if it is just tracking data) being written to the
database. On a high-volume website this would result in views having
to be re-created almost constantly. Therefore efficient view design
becomes paramount.

The view functions shown above return the whole doc, which I know is
inefficient. In fact, since I already have the document I want the
children/parents of, I also already have all the child/parent IDs. Would
it be much more efficient to simply retrieve the parent/child documents
individually rather than having to re-generate views all the time?
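
The two options in this paragraph can be sketched roughly as follows. This
is only an illustration under assumptions: emit() is stubbed, and the names
childrenOfLean, relatedDocUrls, baseUrl and dbName are hypothetical. The
first function emits null instead of the whole doc; the second builds
one plain document URL per related ID, for fetching without any view.

```javascript
// Stand-in for the emit() supplied by CouchDB's view server.
var leanRows = [];
function emit(key, value) {
  leanRows.push({ key: key, value: value });
}

// Leaner variant of "childrenOf": the value is null, so the view no longer
// stores a copy of every document under each parent key.
function childrenOfLean(doc) {
  for (var idx in doc.Parents) {
    emit(doc.Parents[idx], null);
  }
}

// Alternative without a view: the IDs are already in the document at hand,
// so build one document URL per related ID.
// baseUrl and dbName are hypothetical parameters.
function relatedDocUrls(baseUrl, dbName, ids) {
  return ids.map(function (id) {
    return baseUrl + "/" + dbName + "/" + encodeURIComponent(id);
  });
}

// Hypothetical sample document.
var doc = { _id: "a", Parents: ["root"], Children: ["a1", "a2"] };

childrenOfLean(doc);
var urls = relatedDocUrls("http://127.0.0.1:5984", "mydb", doc.Children);
console.log(leanRows, urls);
```

The trade-off is one HTTP request per related document versus a single
view query, so which is cheaper probably depends on how many
children or parents a typical document has.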

As a side question: having to re-generate views constantly in this kind
of situation could prove a real issue. I know that CouchDB is still
pre-1.0 and the developers are necessarily focusing on 'getting it
right' before 'getting it fast' (to coin a phrase :), but will the speed
improvements already on the roadmap make these worries moot except in
very large databases, or is it always going to be an issue that
requires some clever application design, e.g. keeping frequently
updated data in a traditional SQL DB and only keeping rarely updated
data in CouchDB? That would be a shame.

Thanks,
Jon

Re: Error On Getting All Documents From The Databases

Posted by Jan Lehnardt <ja...@apache.org>.
On Oct 27, 2008, at 06:32, Chris Anderson wrote:

> Yuriy,
>
> Try http://127.0.0.1:5984/dev_script/_all_docs
>

Weird. Was `dev_script` created with an older version of CouchDB?
If so, you need to recreate it.

Cheers
Jan
--




Re: Error On Getting All Documents From The Databases

Posted by Chris Anderson <jc...@apache.org>.
Yuriy,

Try http://127.0.0.1:5984/dev_script/_all_docs

Hope that helps,

Chris




-- 
Chris Anderson
http://jchris.mfdz.com

Re: Error On Getting All Documents From The Databases

Posted by Yuriy Sazonets <yu...@gmail.com>.
Oops, sorry, but still the same error :( Also getting this from
CouchDB http frontend.

On Mon, Oct 27, 2008 at 9:44 AM, Paul Carey <pa...@gmail.com> wrote:
> Not sure about the error message you got but I think you want
> _all_docs rather than _all
> curl http://127.0.0.1:5984/dev_script/_all_docs/
>
> Paul
>



-- 
With Respect,
Yuriy.

Re: Error On Getting All Documents From The Databases

Posted by Paul Carey <pa...@gmail.com>.
Not sure about the error message you got but I think you want
_all_docs rather than _all
curl http://127.0.0.1:5984/dev_script/_all_docs/

Paul