Posted to dev@couchdb.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2015/01/21 20:22:36 UTC

[jira] [Commented] (COUCHDB-2426) Timeouts in couch_os_process:start_link/1 leave dangling couch_query_servers

    [ https://issues.apache.org/jira/browse/COUCHDB-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286106#comment-14286106 ] 

ASF subversion and git services commented on COUCHDB-2426:
----------------------------------------------------------

Commit f9d0785c93a6523e187ec4bcc8e487def5db8bd5 in couchdb-couch's branch refs/heads/master from [~mikewallace]
[ https://git-wip-us.apache.org/repos/asf?p=couchdb-couch.git;h=f9d0785 ]

Add a configurable timeout for get_proc calls

Previously the gen_server calls to couch_proc_manager/get_proc
used a timeout of infinity. There are multiple places in the
couch_proc_manager code path where that process can die without
replying. With an infinite timeout the calling couch_query_servers
process would then hang around forever.

This commit makes the gen_server call to get_proc use the value
of couchdb/os_process_timeout as a timeout.

Closes:

  COUCHDB-2425
  COUCHDB-2426

This closes #31

Signed-off-by: Alexander Shorin <kx...@apache.org>
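
The failure mode described in the commit message can be illustrated outside of Erlang. What follows is a minimal Python sketch, not the CouchDB code: a caller waits on a reply queue for a worker that exits without answering. The names request_proc, live_worker and dead_worker, and the timeout values used, are hypothetical. In CouchDB the corresponding wait is a gen_server call whose timeout is taken from couchdb/os_process_timeout.

```python
import queue
import threading


def live_worker(reply_q):
    """Hypothetical worker that replies normally."""
    reply_q.put("proc-1")


def dead_worker(reply_q):
    """Hypothetical worker that exits without ever putting a reply."""
    return  # simulates a process that dies before replying


def request_proc(worker, timeout):
    """Ask `worker` for a reply, waiting at most `timeout` seconds.

    timeout=None blocks forever, which mirrors the old infinity behaviour.
    """
    reply_q = queue.Queue(maxsize=1)
    threading.Thread(target=worker, args=(reply_q,), daemon=True).start()
    try:
        return ("ok", reply_q.get(timeout=timeout))
    except queue.Empty:
        # The caller gives up instead of hanging around forever.
        return ("error", "timeout")


if __name__ == "__main__":
    print(request_proc(live_worker, timeout=0.5))
    print(request_proc(dead_worker, timeout=0.1))
```

With a finite timeout the second call returns an error tuple that the caller can handle. With timeout=None it would never return, which is the analogue of the dangling process reported in the issue.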


> Timeouts in couch_os_process:start_link/1 leave dangling couch_query_servers
> ----------------------------------------------------------------------------
>
>                 Key: COUCHDB-2426
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-2426
>             Project: CouchDB
>          Issue Type: Bug
>      Security Level: public(Regular issues) 
>          Components: JavaScript View Server
>    Affects Versions: 2.0.0
>            Reporter: Mike Wallace
>             Fix For: 2.0.0
>
>
> Similar to COUCHDB-2425, if the couch_os_process:start_link/1 call made in couch_proc_manager:new_proc_int/1 times out [1], then no reply is returned to the calling process. When the caller is couch_query_servers:get_os_process/1 [4], the timeout is infinite, so that process will hang around until either the node reboots or someone intervenes.
> The user-facing symptom is an entry in _active_tasks that makes no progress and never goes away.
> The easiest way to reproduce this is to create a new view and then make couch_os_process time out by patching in an unreasonable timeout (assuming a live dev/run instance):
>  1. Create DB, add a ddoc and a doc:
> {code}
> $ curl -X PUT http://localhost:15984/kitteh
> {"ok":true}
> $ curl -X POST http://localhost:15984/kitteh -d '{"_id":"_design/view", "views": {"test": {"map": "function(doc) { emit(doc.id, 1); }"}}}' -H 'Content-Type: application/json'
> {"ok":true,"id":"_design/view","rev":"1-ef10f980522d4c8e691e3c26d4c3fac5"}
> $ curl -X PUT http://localhost:15984/kitteh/ohai -d '{}'
> {"ok":true,"id":"ohai","rev":"1-967a00dff5e02add41819138abb3284d"}
> {code}
>  2. Apply https://gist.github.com/mikewallace1979/aca1bc331e5eed693357 then re-run make and dev/run.
>  3. Attempt to query the view:
> {code}
> $ curl -X GET http://localhost:15984/kitteh/_design/view/_view/test
> {"error":"timeout","reason":"The request could not be processed in a reasonable amount of time."}
> {code}
>  4. Observe index build tasks which never go away:
> {code}
> $ curl -X GET http://localhost:15986/_active_tasks
> [{"pid":"<0.2527.0>","changes_done":0,"database":"shards/40000000-5fffffff/kitteh.1414783387","design_document":"_design/view","progress":0,"started_on":1414783454,"total_changes":1,"type":"indexer","updated_on":1414783454},{"pid":"<0.2553.0>","changes_done":0,"database":"shards/00000000-1fffffff/kitteh.1414783387","design_document":"_design/view","progress":0,"started_on":1414783454,"total_changes":1,"type":"indexer","updated_on":1414783454}]
> {code}
> [1] https://github.com/apache/couchdb-couch/blob/master/src/couch_proc_manager.erl#L401
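
The symptom in step 4 (indexer tasks that make no progress and never go away) could be checked for mechanically. The following Python sketch is one possible heuristic, not part of CouchDB: it flags indexer entries whose updated_on timestamp is older than a chosen idle limit. The field names are taken from the _active_tasks output quoted above; the function name stalled_indexers and the 300 second default are assumptions made for illustration.

```python
import json


def stalled_indexers(tasks, now, max_idle=300):
    """Return indexer tasks whose updated_on is older than max_idle seconds.

    `tasks` is the decoded JSON list from _active_tasks and `now` is a Unix
    timestamp in seconds. max_idle is an arbitrary, hypothetical threshold.
    """
    return [
        t for t in tasks
        if t.get("type") == "indexer"
        and now - t.get("updated_on", now) > max_idle
    ]


if __name__ == "__main__":
    # Trimmed copy of one entry from the _active_tasks output in step 4.
    sample = json.loads(
        '[{"pid":"<0.2527.0>","changes_done":0,'
        '"database":"shards/40000000-5fffffff/kitteh.1414783387",'
        '"design_document":"_design/view","progress":0,'
        '"started_on":1414783454,"total_changes":1,"type":"indexer",'
        '"updated_on":1414783454}]'
    )
    # Pretend one hour has passed since the task was last updated.
    for task in stalled_indexers(sample, now=1414783454 + 3600):
        print(task["pid"], task["database"])
```

A stale updated_on value is only a hint. A healthy but slow indexer may also go a while between updates, so the threshold would need tuning for a real deployment.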


