Posted to user@couchdb.apache.org by Gene <ge...@iconcmo.com> on 2016/04/15 17:02:29 UTC

Help with CouchDB Crash logs

We are using CouchDB 1.6.1/CentOS Linux release 7.0.1406. CouchDB was installed using `yum`.

We tried to run a data conversion on about 100 databases. Most databases have fewer than 1,500 documents (around 1 MB), except for 3 which have around 200,000 documents (around 250 MB). The conversion ran fine on a few databases, then we started seeing `Error: connect ECONNREFUSED 127.0.0.1:5984` errors.

Conversion steps:

1. Replicate `database_1` to `database_1_backup`.
2. Delete `database_1`.
3. Recreate `database_1`.
4. Read the documents from `database_1_backup` into memory.
5. Write them to `database_1` using bulkDocs.
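
For readers who want to reproduce the sequence, the steps above can be sketched against CouchDB's HTTP API roughly as follows. This is only an illustration, not the script we actually ran: the function names, the `convert` callback, the use of `create_target`, and the stripping of `_rev` before the bulk write are all assumptions made for the sketch.

```python
import json
import urllib.parse
import urllib.request

COUCH_URL = "http://127.0.0.1:5984"  # address taken from the ECONNREFUSED message


def conversion_plan(db):
    """Return steps 1-4 as (method, path, body) tuples (hypothetical helper)."""
    backup = db + "_backup"
    db_path = "/" + urllib.parse.quote(db, safe="")
    backup_path = "/" + urllib.parse.quote(backup, safe="")
    return [
        # Step 1: replicate into the backup; create_target is an assumption.
        ("POST", "/_replicate",
         {"source": db, "target": backup, "create_target": True}),
        # Step 2: delete the original database.
        ("DELETE", db_path, None),
        # Step 3: recreate it empty.
        ("PUT", db_path, None),
        # Step 4: read every document from the backup.
        ("GET", backup_path + "/_all_docs?include_docs=true", None),
    ]


def bulk_docs_body(all_docs_response, convert=lambda doc: doc):
    """Build the _bulk_docs payload for step 5 from an _all_docs response."""
    docs = []
    for row in all_docs_response["rows"]:
        doc = dict(row["doc"])
        # Assumption: drop _rev so the fresh database accepts the document
        # as a new one instead of reporting an update conflict.
        doc.pop("_rev", None)
        docs.append(convert(doc))
    return {"docs": docs}


def send(method, path, body=None):
    """Send one JSON request to CouchDB and return the decoded response."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(
        COUCH_URL + path, data=data, method=method,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def convert_database(db, convert=lambda doc: doc):
    """Run steps 1-5 for one database; any HTTP error aborts the run."""
    result = None
    for method, path, body in conversion_plan(db):
        result = send(method, path, body)
    bulk_path = "/" + urllib.parse.quote(db, safe="") + "/_bulk_docs"
    return send("POST", bulk_path, bulk_docs_body(result, convert))
```

Calling `convert_database("database_1")` would issue the same POST /_replicate and DELETE /database_1/ requests that appear in the log below, followed by the recreate, read, and bulk write.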

Crash log:

[Wed, 13 Apr 2016 21:05:06 GMT] [info] [<0.2715.524>] starting new replication `27dd24d1bd28e13225559e3e0a6c275a` at <0.5681.524> (`database_1` -> `database_1_backup`)
[Wed, 13 Apr 2016 21:05:07 GMT] [info] [<0.5681.524>] recording a checkpoint for `database_1` -> `database_1_backup` at source update_seq 2209
[Wed, 13 Apr 2016 21:05:07 GMT] [info] [<0.2715.524>] <ip.address> - - POST /_replicate 200
[Wed, 13 Apr 2016 21:05:07 GMT] [info] [<0.31752.523>] <ip.address> - - GET /database_1_backup/ 200
[Wed, 13 Apr 2016 21:05:07 GMT] [info] [<0.10914.524>] <ip.address> - - GET /database_1/ 200
[Wed, 13 Apr 2016 21:05:07 GMT] [info] [<0.2623.524>] <ip.address> - - GET /database_1/ 200
[Wed, 13 Apr 2016 21:05:07 GMT] [info] [<0.7567.524>] <ip.address> - - DELETE /database_1/ 200
[Wed, 13 Apr 2016 21:05:07 GMT] [error] [<0.137.0>] ** Generic server couch_index_server terminating
** Last message in was {'$gen_cast',{reset_indexes,<<"database_1">>}}
** When Server state == {st,"/var/lib/couchdb"}
** Reason for termination ==
** {{badmatch,{error,eacces}},
    [{couch_file,nuke_dir,2,[{file,"couch_file.erl"},{line,237}]},
     {couch_file,'-nuke_dir/2-fun-0-',3,[{file,"couch_file.erl"},{line,228}]},
     {lists,foreach,2,[{file,"lists.erl"},{line,1323}]},
     {couch_file,nuke_dir,2,[{file,"couch_file.erl"},{line,236}]},
     {couch_index_server,handle_cast,2,
                         [{file,"src/couch_index_server.erl"},{line,117}]},
     {gen_server,handle_msg,5,[{file,"gen_server.erl"},{line,604}]},
     {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}

Our second attempt to re-run the conversion crashed CouchDB completely.

[Wed, 13 Apr 2016 22:17:19 GMT] [info] [<0.19197.0>] starting new replication `6fe446668153db8635e9f49ddd8895f2` at <0.20012.0> (`database_2` -> `database_2`)
[Wed, 13 Apr 2016 22:17:19 GMT] [info] [<0.20012.0>] recording a checkpoint for `database_2` -> `database_2` at source update_seq 1631
[Wed, 13 Apr 2016 22:17:19 GMT] [error] [<0.20012.0>] Replication `6fe446668153db8635e9f49ddd8895f2` (`database_2` -> `database_2`) failed: {checkpoint_commit_failure,<<"Error updating the target checkpoint document: conflict">>}
[Wed, 13 Apr 2016 22:17:19 GMT] [error] [<0.20012.0>] ** Generic server <0.20012.0> terminating
** Last message in was {'EXIT',<0.20027.0>,normal}
** When Server state == {rep_state,
                         {rep,
                          {"6fe446668153db8635e9f49ddd8895f2",[]},
                          <<"database_2">>,<<"database_2">>,
                          [{checkpoint_interval,5000},
                           {connection_timeout,30000},
                           {http_connections,20},
                           {retries,10},
                           {socket_options,[{keepalive,true},{nodelay,false}]},
                           {use_checkpoints,true},
                           {worker_batch_size,500},
                           {worker_processes,4}],


erl_crash.dump - https://paste.ee/r/EWRYV

SELinux is not an issue here, at least not this time.

Any help debugging this crash log would be greatly appreciated.

Thanks,

Sajin Shrestha

Re: Help with CouchDB Crash logs

Posted by Jan Lehnardt <ja...@apache.org>.

** {{badmatch,{error,eacces}},

This means permission issues. Make sure everything in CouchDB's database_dir and view_index_dir is readable and writable by the user that your CouchDB instance runs under.
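
One way to look for such files is a small script along these lines. It is a rough sketch rather than an official CouchDB tool: the function names are made up for illustration, it judges access from the classic owner/group/other mode bits only (ACLs and SELinux are ignored), and it should be run by a user who can list every directory, since `os.walk` silently skips directories it cannot read.

```python
import os
import stat


def _mode_allows(st, uid, gids):
    """True if the mode bits grant rw (rwx for directories) to the user."""
    if st.st_uid == uid:
        shift = 6          # owner bits apply
    elif st.st_gid in gids:
        shift = 3          # group bits apply
    else:
        shift = 0          # "other" bits apply
    bits = (st.st_mode >> shift) & 0o7
    # Deleting files inside a directory needs write and execute on it.
    needed = 0o7 if stat.S_ISDIR(st.st_mode) else 0o6
    return bits & needed == needed


def inaccessible_paths(root, uid, gids=()):
    """List directories and files under root the user cannot read and write."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            if not _mode_allows(os.lstat(path), uid, gids):
                problems.append(path)
    return problems
```

For the setup in this thread you would pass "/var/lib/couchdb" (the path shown in the crash log) as `root`, together with the uid and group ids of the account CouchDB runs as. On many packaged installs that account is called `couchdb`, but that is an assumption to verify on your system. Any path the function returns is a candidate for the eacces error.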

Best
Jan
--
Professional Support for Apache CouchDB:
https://neighbourhood.ie/couchdb-support/

