Posted to dev@couchdb.apache.org by "Dave Cottlehuber (JIRA)" <ji...@apache.org> on 2013/04/03 13:21:16 UTC
[jira] [Closed] (COUCHDB-1757) CouchDB 1.3.0rc3 crashes when _replicator contains a lot of docs
[ https://issues.apache.org/jira/browse/COUCHDB-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dave Cottlehuber closed COUCHDB-1757.
-------------------------------------
Assignee: Dave Cottlehuber
> CouchDB 1.3.0rc3 crashes when _replicator contains a lot of docs
> ----------------------------------------------------------------
>
> Key: COUCHDB-1757
> URL: https://issues.apache.org/jira/browse/COUCHDB-1757
> Project: CouchDB
> Issue Type: Bug
> Components: Database Core
> Reporter: Sander Dijkhuis
> Assignee: Dave Cottlehuber
>
> I’m deploying an experimental game based on CouchDB with one user per database. For access control, I’m using several _replicator docs per user:
> - one filtered replication from the shared db to the user db,
> - one unfiltered replication from the user db to the shared db,
> - two replications using doc_ids per ‘friendship’ (to share both profiles).
> At the moment, this results in 420 continuous replications. CouchDB 1.3.0rc3 on Ubuntu crashes a couple of seconds after starting, but does not crash when I temporarily remove the _replicator database. With 1.3.0rc1, CouchDB would crash after anywhere from a few minutes to a few hours.
> Some details from the crash report are below. I have trimmed them for privacy and to avoid repetition, and removed the _design doc that appears in the log. Let me know if you need more detail or if I should share one of the _design functions used.
> Am I abusing the replication system, or can I change a setting to allow for longer timeouts?
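For readers trying to picture the setup, the three kinds of _replicator documents described above might look roughly like the sketch below. The fields `source`, `target`, `continuous`, `filter`, `query_params` and `doc_ids` are standard CouchDB replication fields, and the database names and the `lunacy:to:USERNAME` ID follow what the log shows. Everything else is an assumption made for illustration: the helper names, the filter name `server/for_user`, the other document IDs, the profile document IDs, and the direction of the per-friendship replications are not given in the report.

```python
SHARED_DB = "lunacy"  # shared database name, as it appears in the log


def user_db(username):
    """Per-user database name, following the pattern seen in the log."""
    return f"{SHARED_DB}/user/{username}"


def user_docs(username):
    """Two _replicator docs per user: filtered in, unfiltered out."""
    return [
        {
            "_id": f"{SHARED_DB}:to:{username}",
            "source": SHARED_DB,
            "target": user_db(username),
            "continuous": True,
            "filter": "server/for_user",         # hypothetical filter name
            "query_params": {"user": username},  # hypothetical parameter
        },
        {
            "_id": f"{SHARED_DB}:from:{username}",  # hypothetical ID scheme
            "source": user_db(username),
            "target": SHARED_DB,
            "continuous": True,
        },
    ]


def friendship_docs(a, b):
    """Two doc_ids replications per friendship, one per direction (assumed)."""
    return [
        {
            "_id": f"profile:{src}:to:{dst}",  # hypothetical ID scheme
            "source": user_db(src),
            "target": user_db(dst),
            "continuous": True,
            "doc_ids": [f"profile:{src}"],     # hypothetical profile doc ID
        }
        for src, dst in ((a, b), (b, a))
    ]
```

Under this layout, U users and F friendships give 2U + 2F continuous replications, which shows how a modest user base can reach a total like the 420 reported.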
> --
> First, I get something like this for each _replicator doc:
> {code}
> [info] [<0.5368.0>] Replication `"5529b4bdb9c5bdc15b558bd7588511d9+continuous"` is using:
> 4 worker processes
> a worker batch size of 500
> 20 HTTP connections
> a connection timeout of 30000 milliseconds
> 10 retries per request
> socket options are: [{keepalive,true},{nodelay,false}]
> source start sequence 6908
> [info] [<0.5368.0>] Document `lunacy:to:USERNAME` triggered replication `5529b4bdb9c5bdc15b558bd7588511d9+continuous`
> [info] [<0.1213.0>] starting new replication `5529b4bdb9c5bdc15b558bd7588511d9+continuous` at <0.5368.0> (`lunacy` -> `lunacy/user/USERNAME`)
> {code}
> Then:
> {code}
> [error] [<0.5408.0>] OS Process died with status: 137
> [error] [<0.5408.0>] ** Generic server <0.5408.0> terminating
> ** Last message in was {#Port<0.2740>,{exit_status,137}}
> ** When Server state == {os_proc,"/home/sander/git/apache-couchdb-1.3.0/build/bin/couchjs /home/sander/git/apache-couchdb-1.3.0/build/share/couchdb/server/main.js",
> #Port<0.2740>,
> #Fun<couch_os_process.2.132569728>,
> #Fun<couch_os_process.3.35601548>,5000}
> ** Reason for termination ==
> ** {exit_status,137}
> {code}
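A note on reading this status: by common Unix convention, an exit status above 128 means the process was terminated by a signal, and the signal number is the status minus 128. Read that way, 137 corresponds to signal 9 (SIGKILL), which would suggest couchjs was killed from outside rather than exiting on its own. The helper below is a small illustrative sketch of that decoding on a POSIX system; the function name is hypothetical and it is not part of CouchDB.

```python
import signal


def decode_exit_status(status):
    """Return the signal name if status follows the 128+N convention, else None."""
    if status <= 128:
        return None
    try:
        return signal.Signals(status - 128).name
    except ValueError:
        # Not a signal number known on this platform.
        return None


print(decode_exit_status(137))
```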
> Followed by:
> {code}
> =ERROR REPORT==== 2-Apr-2013::19:18:20 ===
> ** Generic server <0.5408.0> terminating
> ** Last message in was {#Port<0.2740>,{exit_status,137}}
> ** When Server state == {os_proc,"/home/sander/git/apache-couchdb-1.3.0/build/bin/couchjs /home/sander/git/apache-couchdb-1.3.0/build/share/couchdb/server/main.js",
> #Port<0.2740>,
> #Fun<couch_os_process.2.132569728>,
> #Fun<couch_os_process.3.35601548>,5000}
> ** Reason for termination ==
> ** {exit_status,137}
> [error] [<0.5408.0>] {error_report,<0.31.0>,
> {<0.5408.0>,crash_report,
> [[{initial_call,
> {couch_os_process,init,['Argument__1']}},
> {pid,<0.5408.0>},
> {registered_name,[]},
> {error_info,
> {exit,
> {exit_status,137},
> [{gen_server,terminate,6},
> {proc_lib,init_p_do_apply,3}]}},
> {ancestors,
> [couch_query_servers,couch_secondary_services,
> couch_server_sup,<0.32.0>]},
> {messages,[]},
> {links,[<0.111.0>,<0.5339.0>]},
> {dictionary,[]},
> {trap_exit,false},
> {status,running},
> {heap_size,1597},
> {stack_size,24},
> {reductions,1197}],
> [{neighbour,
> [{pid,<0.5345.0>},
> {registered_name,[]},
> {initial_call,
> {couch_event_sup,init,['Argument__1']}},
> {current_function,{gen_server,loop,6}},
> {ancestors,[<0.5339.0>]},
> {messages,[]},
> {links,[<0.5339.0>,<0.89.0>]},
> {dictionary,[]},
> {trap_exit,false},
> {status,waiting},
> {heap_size,987},
> {stack_size,9},
> {reductions,32}]},
> {neighbour,
> [{pid,<0.5339.0>},
> {registered_name,[]},
> {initial_call,{erlang,apply,2}},
> {current_function,{gen,do_call,4}},
> {ancestors,[]},
> {messages,[]},
> {links,[<0.5345.0>,<0.5408.0>,<0.5335.0>]},
> {dictionary,[]},
> {trap_exit,false},
> {status,waiting},
> {heap_size,6765},
> {stack_size,104},
> {reductions,1988}]}]]}}
> =CRASH REPORT==== 2-Apr-2013::19:18:21 ===
> crasher:
> initial call: couch_os_process:init/1
> pid: <0.5408.0>
> registered_name: []
> exception exit: {exit_status,137}
> in function gen_server:terminate/6
> ancestors: [couch_query_servers,couch_secondary_services,
> couch_server_sup,<0.32.0>]
> messages: []
> links: [<0.111.0>,<0.5339.0>]
> dictionary: []
> trap_exit: false
> status: running
> heap_size: 1597
> stack_size: 24
> reductions: 1197
> neighbours:
> neighbour: [{pid,<0.5345.0>},
> {registered_name,[]},
> {initial_call,{couch_event_sup,init,['Argument__1']}},
> {current_function,{gen_server,loop,6}},
> {ancestors,[<0.5339.0>]},
> {messages,[]},
> {links,[<0.5339.0>,<0.89.0>]},
> {dictionary,[]},
> {trap_exit,false},
> {status,waiting},
> {heap_size,987},
> {stack_size,9},
> {reductions,32}]
> neighbour: [{pid,<0.5339.0>},
> {registered_name,[]},
> {initial_call,{erlang,apply,2}},
> {current_function,{gen,do_call,4}},
> {ancestors,[]},
> {messages,[]},
> {links,[<0.5345.0>,<0.5408.0>,<0.5335.0>]},
> {dictionary,[]},
> {trap_exit,false},
> {status,waiting},
> {heap_size,6765},
> {stack_size,104},
> {reductions,1988}]
> [error] [<0.5335.0>] ChangesReader process died with reason: {exit_status,137}
> [error] [<0.111.0>] OS Process Error <0.5412.0> :: {os_process_error,
> "OS process timed out."}
> [error] [<0.5387.0>] OS Process died with status: 137
> [error] [<0.5385.0>] OS Process died with status: 137
> [error] [<0.5335.0>] Replication `f7ecf7f435811899c912619f899f24b4+continuous` (`lunacy` -> `lunacy/user/USERNAME`) failed: changes_reader_died
> [error] [<0.5258.0>] ChangesReader process died with reason: shutdown
> [error] [<0.5387.0>] ** Generic server <0.5387.0> terminating
> ** Last message in was {#Port<0.2730>,{exit_status,137}}
> ** When Server state == {os_proc,"/home/sander/git/apache-couchdb-1.3.0/build/bin/couchjs /home/sander/git/apache-couchdb-1.3.0/build/share/couchdb/server/main.js",
> #Port<0.2730>,
> #Fun<couch_os_process.2.132569728>,
> #Fun<couch_os_process.3.35601548>,5000}
> ** Reason for termination ==
> ** {exit_status,137}
> =ERROR REPORT==== 2-Apr-2013::19:18:21 ===
> ** Generic server <0.5387.0> terminating
> ** Last message in was {#Port<0.2730>,{exit_status,137}}
> ** When Server state == {os_proc,"/home/sander/git/apache-couchdb-1.3.0/build/bin/couchjs /home/sander/git/apache-couchdb-1.3.0/build/share/couchdb/server/main.js",
> #Port<0.2730>,
> #Fun<couch_os_process.2.132569728>,
> #Fun<couch_os_process.3.35601548>,5000}
> ** Reason for termination ==
> ** {exit_status,137}
> [error] [<0.5385.0>] ** Generic server <0.5385.0> terminating
> ** Last message in was {#Port<0.2729>,{exit_status,137}}
> ** When Server state == {os_proc,"/home/sander/git/apache-couchdb-1.3.0/build/bin/couchjs /home/sander/git/apache-couchdb-1.3.0/build/share/couchdb/server/main.js",
> #Port<0.2729>,
> #Fun<couch_os_process.2.132569728>,
> #Fun<couch_os_process.3.35601548>,5000}
> ** Reason for termination ==
> ** {exit_status,137}
> =ERROR REPORT==== 2-Apr-2013::19:18:21 ===
> ** Generic server <0.5385.0> terminating
> ** Last message in was {#Port<0.2729>,{exit_status,137}}
> ** When Server state == {os_proc,"/home/sander/git/apache-couchdb-1.3.0/build/bin/couchjs /home/sander/git/apache-couchdb-1.3.0/build/share/couchdb/server/main.js",
> #Port<0.2729>,
> #Fun<couch_os_process.2.132569728>,
> #Fun<couch_os_process.3.35601548>,5000}
> ** Reason for termination ==
> ** {exit_status,137}
> [error] [<0.5385.0>] {error_report,<0.31.0>,
> {<0.5385.0>,crash_report,
> [[{initial_call,
> {couch_os_process,init,['Argument__1']}},
> {pid,<0.5385.0>},
> {registered_name,[]},
> {error_info,
> {exit,
> {exit_status,137},
> [{gen_server,terminate,6},
> {proc_lib,init_p_do_apply,3}]}},
> {ancestors,
> [couch_query_servers,couch_secondary_services,
> couch_server_sup,<0.32.0>]},
> {messages,[]},
> {links,[<0.111.0>,<0.5207.0>]},
> {dictionary,[]},
> {trap_exit,false},
> {status,running},
> {heap_size,1597},
> {stack_size,24},
> {reductions,1205}],
> [{neighbour,
> [{pid,<0.5213.0>},
> {registered_name,[]},
> {initial_call,
> {couch_event_sup,init,['Argument__1']}},
> {current_function,{gen_server,loop,6}},
> {ancestors,[<0.5207.0>]},
> {messages,[]},
> {links,[<0.5207.0>,<0.89.0>]},
> {dictionary,[]},
> {trap_exit,false},
> {status,waiting},
> {heap_size,987},
> {stack_size,9},
> {reductions,32}]},
> {neighbour,
> [{pid,<0.5207.0>},
> {registered_name,[]},
> {initial_call,{erlang,apply,2}},
> {current_function,{gen,do_call,4}},
> {ancestors,[]},
> {messages,[]},
> {links,[<0.5213.0>,<0.5385.0>,<0.5203.0>]},
> {dictionary,[]},
> {trap_exit,false},
> {status,waiting},
> {heap_size,6765},
> {stack_size,104},
> {reductions,1988}]}]]}}
> =CRASH REPORT==== 2-Apr-2013::19:18:22 ===
> crasher:
> initial call: couch_os_process:init/1
> pid: <0.5385.0>
> registered_name: []
> exception exit: {exit_status,137}
> in function gen_server:terminate/6
> ancestors: [couch_query_servers,couch_secondary_services,
> couch_server_sup,<0.32.0>]
> messages: []
> links: [<0.111.0>,<0.5207.0>]
> dictionary: []
> trap_exit: false
> status: running
> heap_size: 1597
> stack_size: 24
> reductions: 1205
> neighbours:
> neighbour: [{pid,<0.5213.0>},
> {registered_name,[]},
> {initial_call,{couch_event_sup,init,['Argument__1']}},
> {current_function,{gen_server,loop,6}},
> {ancestors,[<0.5207.0>]},
> {messages,[]},
> {links,[<0.5207.0>,<0.89.0>]},
> {dictionary,[]},
> {trap_exit,false},
> {status,waiting},
> {heap_size,987},
> {stack_size,9},
> {reductions,32}]
> neighbour: [{pid,<0.5207.0>},
> {registered_name,[]},
> {initial_call,{erlang,apply,2}},
> {current_function,{gen,do_call,4}},
> {ancestors,[]},
> {messages,[]},
> {links,[<0.5213.0>,<0.5385.0>,<0.5203.0>]},
> {dictionary,[]},
> {trap_exit,false},
> {status,waiting},
> {heap_size,6765},
> {stack_size,104},
> {reductions,1988}]
> [error] [<0.5387.0>] {error_report,<0.31.0>,
> {<0.5387.0>,crash_report,
> [[{initial_call,
> {couch_os_process,init,['Argument__1']}},
> {pid,<0.5387.0>},
> {registered_name,[]},
> {error_info,
> {exit,
> {exit_status,137},
> [{gen_server,terminate,6},
> {proc_lib,init_p_do_apply,3}]}},
> {ancestors,
> [couch_query_servers,couch_secondary_services,
> couch_server_sup,<0.32.0>]},
> {messages,[]},
> {links,[<0.111.0>,<0.5218.0>]},
> {dictionary,[]},
> {trap_exit,false},
> {status,running},
> {heap_size,1597},
> {stack_size,24},
> {reductions,1205}],
> [{neighbour,
> [{pid,<0.5224.0>},
> {registered_name,[]},
> {initial_call,
> {couch_event_sup,init,['Argument__1']}},
> {current_function,{gen_server,loop,6}},
> {ancestors,[<0.5218.0>]},
> {messages,[]},
> {links,[<0.5218.0>,<0.89.0>]},
> {dictionary,[]},
> {trap_exit,false},
> {status,waiting},
> {heap_size,987},
> {stack_size,9},
> {reductions,32}]},
> {neighbour,
> [{pid,<0.5218.0>},
> {registered_name,[]},
> {initial_call,{erlang,apply,2}},
> {current_function,{gen,do_call,4}},
> {ancestors,[]},
> {messages,[]},
> {links,[<0.5224.0>,<0.5387.0>,<0.5214.0>]},
> {dictionary,[]},
> {trap_exit,false},
> {status,waiting},
> {heap_size,6765},
> {stack_size,104},
> {reductions,1947}]}]]}}
> =CRASH REPORT==== 2-Apr-2013::19:18:24 ===
> crasher:
> initial call: couch_os_process:init/1
> pid: <0.5387.0>
> registered_name: []
> exception exit: {exit_status,137}
> in function gen_server:terminate/6
> ancestors: [couch_query_servers,couch_secondary_services,
> couch_server_sup,<0.32.0>]
> messages: []
> links: [<0.111.0>,<0.5218.0>]
> dictionary: []
> trap_exit: false
> status: running
> heap_size: 1597
> stack_size: 24
> reductions: 1205
> neighbours:
> neighbour: [{pid,<0.5224.0>},
> {registered_name,[]},
> {initial_call,{couch_event_sup,init,['Argument__1']}},
> {current_function,{gen_server,loop,6}},
> {ancestors,[<0.5218.0>]},
> {messages,[]},
> {links,[<0.5218.0>,<0.89.0>]},
> {dictionary,[]},
> {trap_exit,false},
> {status,waiting},
> {heap_size,987},
> {stack_size,9},
> {reductions,32}]
> neighbour: [{pid,<0.5218.0>},
> {registered_name,[]},
> {initial_call,{erlang,apply,2}},
> {current_function,{gen,do_call,4}},
> {ancestors,[]},
> {messages,[]},
> {links,[<0.5224.0>,<0.5387.0>,<0.5214.0>]},
> {dictionary,[]},
> {trap_exit,false},
> {status,waiting},
> {heap_size,6765},
> {stack_size,104},
> {reductions,1947}]
> [error] [<0.5302.0>] ChangesReader process died with reason: shutdown
> [error] [<0.5192.0>] ChangesReader process died with reason: shutdown
> [error] [<0.5203.0>] ChangesReader process died with reason: {exit_status,137}
> [error] [<0.5214.0>] ChangesReader process died with reason: {exit_status,137}
> [error] [<0.3692.0>] ChangesReader process died with reason: shutdown
> [error] [<0.5258.0>] Replication `3d6539a2a9e3201a6eacd0b7db4c7dd3+continuous` (`lunacy` -> `lunacy/user/USERNAME`) failed: changes_reader_died
> [error] [<0.5170.0>] ChangesReader process died with reason: shutdown
> [error] [<0.5236.0>] ChangesReader process died with reason: shutdown
> [error] [<0.5280.0>] ChangesReader process died with reason: shutdown
> [error] [<0.5225.0>] ChangesReader process died with reason: shutdown
> [error] [<0.5324.0>] ChangesReader process died with reason: shutdown
> [error] [<0.5291.0>] ChangesReader process died with reason: shutdown
> [error] [<0.5313.0>] ChangesReader process died with reason: shutdown
> [error] [<0.5181.0>] ChangesReader process died with reason: shutdown
> [error] [<0.5269.0>] ChangesReader process died with reason: shutdown
> [error] [<0.111.0>] ** Generic server couch_query_servers terminating
> ** Last message in was {get_proc,{doc,<<"_design/server">>,
> {31,
> [<<2,129,73,127,145,177,85,156,51,70,79,
> 122,210,226,20,220>>, (ET CETERA)
> [],false,[]},
> {<<"_design/server">>,
> <<"31-0281497f91b1559c33464f7ad2e214dc">>}}
> ** When Server state == {qserver,32811,41005,45102,36908,[],
> {[{<<"reduce_limit">>,true},
> {<<"timeout">>,5000}]}}
> ** Reason for termination ==
> ** {bad_return_value,{os_process_error,"OS process timed out."}}
> {code}
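Because the log is long and repetitive, a quick tally can help separate the different failure reasons. The sketch below counts `ChangesReader process died` lines by reason. The regular expression is written against the line format shown in the log above and is an assumption about anything beyond it; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches e.g. "[error] [<0.5302.0>] ChangesReader process died with reason: shutdown"
PATTERN = re.compile(
    r"\[error\] \[<[\d.]+>\] ChangesReader process died with reason: (.+)$"
)


def tally_changes_reader_deaths(lines):
    """Count ChangesReader deaths per reason string."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts
```

In the excerpt above, three readers report `{exit_status,137}` (<0.5335.0>, <0.5203.0> and <0.5214.0>); the rest report `shutdown`.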
> And finally:
> {code}
> {'$gen_call',
> {<0.3696.0>,#Ref<0.0.0.31225>},
> {unlink_proc,<0.3714.0>}},
> {'$gen_call',
> {<0.5174.0>,#Ref<0.0.0.31231>},
> {unlink_proc,<0.5379.0>}},
> {'$gen_call',
> {<0.5185.0>,#Ref<0.0.0.31237>},
> {unlink_proc,<0.5381.0>}},
> {'$gen_call',
> {<0.5196.0>,#Ref<0.0.0.31243>},
> {unlink_proc,<0.5383.0>}},
> {'$gen_call',
> {<0.5207.0>,#Ref<0.0.0.31249>},
> {unlink_proc,<0.5385.0>}},
> {'$gen_call',
> {<0.5218.0>,#Ref<0.0.0.31255>},
> {unlink_proc,<0.5387.0>}},
> {'$gen_call',
> {<0.5229.0>,#Ref<0.0.0.31261>},
> {unlink_proc,<0.5389.0>}},
> {'$gen_call',
> {<0.5240.0>,#Ref<0.0.0.31267>},
> {unlink_proc,<0.5391.0>}},
> {'$gen_call',
> {<0.5262.0>,#Ref<0.0.0.31273>},
> {unlink_proc,<0.5393.0>}},
> {'$gen_call',
> {<0.5273.0>,#Ref<0.0.0.31299>},
> {unlink_proc,<0.5395.0>}},
> {'$gen_call',
> {<0.5284.0>,#Ref<0.0.0.31305>},
> {unlink_proc,<0.5398.0>}},
> {'$gen_call',
> {<0.5295.0>,#Ref<0.0.0.31311>},
> {unlink_proc,<0.5400.0>}},
> {'$gen_call',
> {<0.5306.0>,#Ref<0.0.0.31317>},
> {unlink_proc,<0.5402.0>}},
> {'$gen_call',
> {<0.5317.0>,#Ref<0.0.0.31323>},
> {unlink_proc,<0.5404.0>}},
> {'$gen_call',
> {<0.5328.0>,#Ref<0.0.0.31329>},
> {unlink_proc,<0.5406.0>}},
> {'$gen_call',
> {<0.5339.0>,#Ref<0.0.0.31359>},
> {unlink_proc,<0.5408.0>}},
> {'EXIT',<0.5408.0>,{exit_status,137}},
> {'DOWN',#Ref<0.0.0.31331>,process,<0.5408.0>,
> {exit_status,137}},
> {'EXIT',<0.5412.0>,normal},
> {'DOWN',#Ref<0.0.0.31360>,process,<0.5412.0>,normal},
> {'DOWN',#Ref<0.0.0.31269>,process,<0.5393.0>,
> shutdown},
> {'DOWN',#Ref<0.0.0.21467>,process,<0.3714.0>,
> shutdown},
> {'DOWN',#Ref<0.0.0.31313>,process,<0.5402.0>,
> shutdown},
> {'DOWN',#Ref<0.0.0.31239>,process,<0.5383.0>,
> shutdown},
> {'DOWN',#Ref<0.0.0.31245>,process,<0.5385.0>,
> {exit_status,137}},
> {'EXIT',<0.5387.0>,{exit_status,137}},
> {'DOWN',#Ref<0.0.0.31251>,process,<0.5387.0>,
> {exit_status,137}},
> {'DOWN',#Ref<0.0.0.31227>,process,<0.5379.0>,
> shutdown},
> {'DOWN',#Ref<0.0.0.31263>,process,<0.5391.0>,
> shutdown},
> {'DOWN',#Ref<0.0.0.31257>,process,<0.5389.0>,
> shutdown},
> {'DOWN',#Ref<0.0.0.31301>,process,<0.5398.0>,
> shutdown},
> {'DOWN',#Ref<0.0.0.31325>,process,<0.5406.0>,
> shutdown},
> {'DOWN',#Ref<0.0.0.31319>,process,<0.5404.0>,
> shutdown},
> {'DOWN',#Ref<0.0.0.31307>,process,<0.5400.0>,
> shutdown},
> {'DOWN',#Ref<0.0.0.31233>,process,<0.5381.0>,
> shutdown},
> {'DOWN',#Ref<0.0.0.31275>,process,<0.5395.0>,
> shutdown}]},
> {links,[<0.94.0>]},
> {dictionary,[]},
> {trap_exit,true},
> {status,running},
> {heap_size,17711},
> {stack_size,24},
> {reductions,7801}],
> []]}}
> {code}