Posted to dev@couchdb.apache.org by "Filipe Manana (JIRA)" <ji...@apache.org> on 2010/07/13 23:57:50 UTC

[jira] Closed: (COUCHDB-575) CouchDB crashes and restarts when multiple databases are being compacted at once

     [ https://issues.apache.org/jira/browse/COUCHDB-575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Filipe Manana closed COUCHDB-575.
---------------------------------

    Resolution: Fixed

This issue was fixed by COUCHDB-761.

Thanks to Randall Leeds and Adam Kocoloski.

> CouchDB crashes and restarts when multiple databases are being compacted at once
> --------------------------------------------------------------------------------
>
>                 Key: COUCHDB-575
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-575
>             Project: CouchDB
>          Issue Type: Bug
>    Affects Versions: 0.11
>         Environment: {"couchdb":"Welcome","version":"0.11.0b1e2a54d1-git"}
> Gentoo 10.1, rebuilt as of Nov 12, 64-bit, linux kernel version 2.6.30
> Erlang built from source file otp_src_R13B02-1.tar.gz (Gentoo calls it 13.2.2)
>            Reporter: James Marca
>            Priority: Minor
>
> When I run compaction on multiple databases at once, CouchDB will crash and restart.  
> This happens on views and databases.  
> When compacting DBs, I had definite crashes when compacting just two databases.  When compacting views, I haven't yet seen a crash with just two views running, but have with as few as 5 views being compacted.
> My DBs and views are large, but not unreasonable.  The databases run around 23G each (after compaction).  The views are similarly sized:
> james@lysithia ~ $ ls -lrth /var/lib/couchdb/.d12_2007_02morehash_design/ 
> total 74G
> -rw-r--r-- 1 couchdb couchdb  25G 2009-11-18 11:29 172235a8f385d7dc0e0818e5d003aad2.view
> -rw-r--r-- 1 couchdb couchdb  13G 2009-11-19 19:30 2657851bc558aef6b89d05361193fc5e.view
> -rw-r--r-- 1 couchdb couchdb  44M 2009-11-23 14:18 172235a8f385d7dc0e0818e5d003aad2.compact.view
> -rw-r--r-- 1 couchdb couchdb 3.9M 2009-11-23 14:18 2657851bc558aef6b89d05361193fc5e.compact.view
> -rw-r--r-- 1 couchdb couchdb  37G 2009-11-23 14:34 433c5bd5313e5509b96b67a6eb3d1145.view
> -rw-r--r-- 1 couchdb couchdb  57M 2009-11-23 14:43 433c5bd5313e5509b96b67a6eb3d1145.compact.view
> [Mon, 23 Nov 2009 23:07:46 GMT] [debug] [<0.92.0>] Spawning new group server for view group _design/summary in database d12_2007_06morehash.
> [Mon, 23 Nov 2009 23:07:46 GMT] [debug] [<0.80.0>] New task status for d12_2007_05morehash/summary3: Copied 20000 of 909614 Ids (2%)
> [Mon, 23 Nov 2009 23:07:48 GMT] [info] [<0.479.0>] 127.0.0.1 - - 'POST' /d12_2007_06morehash/_compact/summary 202
> [Mon, 23 Nov 2009 23:07:48 GMT] [info] [<0.520.0>] View index compaction starting for d12_2007_06morehash _design/summary
> [Mon, 23 Nov 2009 23:07:49 GMT] [debug] [<0.520.0>] Resetting group index "_design/summary" in db d12_2007_06morehash
> [Mon, 23 Nov 2009 23:07:49 GMT] [debug] [<0.80.0>] New task status for d12_2007_06morehash _design/summary: Processed 0 of 23 changes (0%)
> [Mon, 23 Nov 2009 23:07:49 GMT] [debug] [<0.80.0>] New task status for d12_2007_06morehash/summary: Copied 0 of 901450 Ids (0%)
> [Mon, 23 Nov 2009 23:07:49 GMT] [debug] [<0.80.0>] New task status for d12_2007_06morehash _design/summary: Processed 12 of 23 changes (52%)
> [Mon, 23 Nov 2009 23:07:50 GMT] [debug] [<0.80.0>] New task status for d12_2007_06morehash _design/summary: Processed 18 of 23 changes (78%)
> [Mon, 23 Nov 2009 23:07:50 GMT] [debug] [<0.80.0>] New task status for d12_2007_06morehash _design/summary: Finishing.
> [Mon, 23 Nov 2009 23:07:50 GMT] [debug] [<0.80.0>] New task status for d12_2007_05morehash/summary3: Copied 30000 of 909614 Ids (3%)
> [Mon, 23 Nov 2009 23:07:52 GMT] [debug] [<0.80.0>] New task status for d12_2007_05morehash/summary3: Copied 40000 of 909614 Ids (4%)
> [Mon, 23 Nov 2009 23:07:53 GMT] [debug] [<0.80.0>] New task status for d12_2007_02morehash/summary3: Copied 110000 of 874175 Ids (12%)
> [Mon, 23 Nov 2009 23:07:53 GMT] [debug] [<0.80.0>] New task status for d12_2007_05morehash/summary3: Copied 50000 of 909614 Ids (5%)
> [Mon, 23 Nov 2009 23:07:54 GMT] [info] [<0.520.0>] checkpointing view update at seq 1156739 for d12_2007_06morehash _design/summary
> [Mon, 23 Nov 2009 23:08:05 GMT] [debug] [<0.80.0>] New task status for d12_2007_04morehash/summary3: Copied 90000 of 983989 Ids (9%)
> [Mon, 23 Nov 2009 23:08:05 GMT] [debug] [<0.80.0>] New task status for d12_2007_02morehash/summary2: Copied 50000 of 874175 Ids (5%)
> [Mon, 23 Nov 2009 23:08:05 GMT] [error] [<0.80.0>] ** Generic server couch_task_status terminating 
> ** Last message in was {#Ref<0.0.3.46775>,1}
> ** When Server state == nil
> ** Reason for termination == 
> ** {function_clause,
>        [{couch_task_status,handle_info,[{#Ref<0.0.3.46775>,1},nil]},
>         {gen_server,handle_msg,5},
>         {proc_lib,init_p_do_apply,3}]}
> [Mon, 23 Nov 2009 23:08:05 GMT] [error] [<0.80.0>] {error_report,<0.29.0>,
>     {<0.80.0>,crash_report,
>      [[{initial_call,{couch_task_status,init,['Argument__1']}},
>        {pid,<0.80.0>},
>        {registered_name,couch_task_status},
>        {error_info,
>            {exit,
>                {function_clause,
>                    [{couch_task_status,handle_info,
>                         [{#Ref<0.0.3.46775>,1},nil]},
>                     {gen_server,handle_msg,5},
>                     {proc_lib,init_p_do_apply,3}]},
>                [{gen_server,terminate,6},{proc_lib,init_p_do_apply,3}]}},
>        {ancestors,[couch_primary_services,couch_server_sup,<0.30.0>]},
>        {messages,[]},
>        {links,[<0.76.0>]},
>        {dictionary,[]},
>        {trap_exit,false},
>        {status,running},
>        {heap_size,1597},
>        {stack_size,24},
>        {reductions,3059}],
>       []]}}
> [Mon, 23 Nov 2009 23:08:05 GMT] [error] [<0.76.0>] {error_report,<0.29.0>,
>     {<0.76.0>,supervisor_report,
>      [{supervisor,{local,couch_primary_services}},
>       {errorContext,child_terminated},
>       {reason,
>           {function_clause,
>               [{couch_task_status,handle_info,[{#Ref<0.0.3.46775>,1},nil]},
>                {gen_server,handle_msg,5},
>                {proc_lib,init_p_do_apply,3}]}},
>       {offender,
>           [{pid,<0.80.0>},
>            {name,couch_task_status},
>            {mfa,{couch_task_status,start_link,[]}},
>            {restart_type,permanent},
>            {shutdown,brutal_kill},
>            {child_type,worker}]}]}}
> [Mon, 23 Nov 2009 23:08:05 GMT] [error] [<0.19.0>] {error_report,<0.7.0>,
>               {<0.19.0>,std_error,
>                "File operation error: eacces. Target: ./lib.beam. Function: get_file. Process: code_server."}}
> [Mon, 23 Nov 2009 23:08:05 GMT] [error] [<0.19.0>] {error_report,<0.7.0>,
>               {<0.19.0>,std_error,
>                "File operation error: eacces. Target: ./erl_internal.beam. Function: get_file. Process: code_server."}}
> [Mon, 23 Nov 2009 23:08:05 GMT] [error] [<0.553.0>] ** Generic server couch_task_status terminating 
> ** Last message in was {'$gen_cast',
>                            {update_status,<0.494.0>,
>                                <<"Copied 60000 of 909614 Ids (6%)">>}}
> ** When Server state == nil
> ** Reason for termination == 
> ** {{badmatch,[]},
>     [{couch_task_status,handle_cast,2},
>      {gen_server,handle_msg,5},
>      {proc_lib,init_p_do_apply,3}]}
> [Mon, 23 Nov 2009 23:08:05 GMT] [error] [<0.553.0>] {error_report,<0.29.0>,
>               {<0.553.0>,crash_report,
>                [[{initial_call,{couch_task_status,init,['Argument__1']}},
>                  {pid,<0.553.0>},
>                  {registered_name,couch_task_status},
>                  {error_info,{exit,{{badmatch,[]},
>                                     [{couch_task_status,handle_cast,2},
>                                      {gen_server,handle_msg,5},
>                                      {proc_lib,init_p_do_apply,3}]},
>                                    [{gen_server,terminate,6},
>                                     {proc_lib,init_p_do_apply,3}]}},
>                  {ancestors,[couch_primary_services,couch_server_sup,
>                              <0.30.0>]},
>                  {messages,[]},
>                  {links,[<0.76.0>]},
>                  {dictionary,[]},
>                  {trap_exit,false},
>                  {status,running},
>                  {heap_size,377},
>                  {stack_size,24},
>                  {reductions,127}],
>                 []]}}
> [Mon, 23 Nov 2009 23:08:05 GMT] [error] [<0.76.0>] {error_report,<0.29.0>,
>               {<0.76.0>,supervisor_report,
>                [{supervisor,{local,couch_primary_services}},
>                 {errorContext,child_terminated},
>                 {reason,{{badmatch,[]},
>                          [{couch_task_status,handle_cast,2},
>                           {gen_server,handle_msg,5},
>                           {proc_lib,init_p_do_apply,3}]}},
>                 {offender,[{pid,<0.553.0>},
>                            {name,couch_task_status},
>                            {mfa,{couch_task_status,start_link,[]}},
>                            {restart_type,permanent},
>                            {shutdown,brutal_kill},
>                            {child_type,worker}]}]}}
> [Mon, 23 Nov 2009 23:08:07 GMT] [error] [<0.555.0>] ** Generic server couch_task_status terminating 
> ** Last message in was {'$gen_cast',
>                            {update_status,<0.458.0>,
>                                <<"Copied 100000 of 983989 Ids (10%)">>}}
> ** When Server state == nil
> ** Reason for termination == 
> ** {{badmatch,[]},
>     [{couch_task_status,handle_cast,2},
>      {gen_server,handle_msg,5},
>      {proc_lib,init_p_do_apply,3}]}
> And so on.  I can post more, but I don't know what I'm looking at or what is helpful.
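
For anyone trying to reproduce the report, the following is a minimal sketch of how several compactions might be started at once. It is not taken from the reporter's setup: the helper names (compaction_url, start_compaction, compact_many) and the base URL http://127.0.0.1:5984 are assumptions for illustration. The only request shape confirmed by the log above is POST /<db>/_compact/<design doc name>, which returned 202; the sketch assumes the same path without the design doc name compacts the database itself, and that the server needs no credentials.

```python
import urllib.parse
import urllib.request


def compaction_url(base_url, db, design_doc=None):
    """Build a compaction URL for a database or one of its view groups.

    Hypothetical helper: with design_doc=None it targets the database
    itself, otherwise the view index of that design document.
    """
    path = "/" + urllib.parse.quote(db, safe="") + "/_compact"
    if design_doc is not None:
        path += "/" + urllib.parse.quote(design_doc, safe="")
    return base_url.rstrip("/") + path


def start_compaction(url):
    """POST to the compaction URL and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=b"",
        method="POST",
        # Assumption: harmless on old servers, required by newer ones.
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def compact_many(base_url, targets):
    """Start one compaction per (db, design_doc) pair.

    The log shows the POST answered with 202 while the work continues
    in the background, so plain sequential requests should be enough
    to leave several compactions running side by side.
    """
    results = {}
    for db, design_doc in targets:
        url = compaction_url(base_url, db, design_doc)
        results[url] = start_compaction(url)
    return results


# Example call (needs a running server; base URL is an assumption):
# compact_many("http://127.0.0.1:5984", [
#     ("d12_2007_02morehash", "summary3"),
#     ("d12_2007_05morehash", "summary3"),
#     ("d12_2007_06morehash", "summary"),
# ])
```

The database and view names in the commented call are the ones that appear in the log; whether this particular mix triggers the crash is not something the report establishes.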

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.