Posted to dev@uima.apache.org by "Jim Challenger (JIRA)" <de...@uima.apache.org> on 2013/07/16 22:52:49 UTC

[jira] [Resolved] (UIMA-2593) RM: Resource Manager mishandling dead node with Work Items in Limbo

     [ https://issues.apache.org/jira/browse/UIMA-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Challenger resolved UIMA-2593.
----------------------------------

    Resolution: Fixed
    
> RM: Resource Manager mishandling dead node with Work Items in Limbo
> -------------------------------------------------------------------
>
>                 Key: UIMA-2593
>                 URL: https://issues.apache.org/jira/browse/UIMA-2593
>             Project: UIMA
>          Issue Type: Bug
>          Components: DUCC
>            Reporter: Jim Challenger
>            Assignee: Jim Challenger
>             Fix For: 1.0-Ducc
>
>
> If a node dies while one of its work items is starting but not yet confirmed, that work item goes into Limbo, and RM then keeps allocating new nodes until the pool is exhausted.
> The correct behavior is for RM to allocate only enough nodes to make up for the dead one, based on the remaining work.
> To reproduce, start a small cluster and submit a job with a couple hundred short (5-10 second) work items.  Once all nodes are full, issue SIGSTOP to one agent and JP.  This should cause at least one WI to go into Limbo.  When the heartbeat counter says the node is dead, we expect to see the errant behavior start.
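
The expected behavior described above (replace only what the dead node was providing, bounded by the remaining work) can be illustrated with a minimal sketch. This is not the DUCC RM code or the actual fix; the class name, method name, and parameters are all hypothetical, and it assumes a fixed number of work-item slots per node and that work items in Limbo are counted as remaining work.

```java
// Illustrative sketch only; not taken from the DUCC Resource Manager source.
// All names here (LimboReplacementSketch, nodesToAllocate, the parameters)
// are hypothetical and chosen for this example.
final class LimboReplacementSketch {

    /**
     * Computes how many additional nodes to request after a node is declared dead.
     *
     * @param remainingWorkItems work items still to run, including any in Limbo
     * @param slotsPerNode       work items one node can run at once (assumed fixed)
     * @param liveNodesHeld      nodes the job still holds, excluding the dead one
     * @param freeNodesInPool    nodes currently available in the pool
     */
    static int nodesToAllocate(int remainingWorkItems, int slotsPerNode,
                               int liveNodesHeld, int freeNodesInPool) {
        if (slotsPerNode <= 0) {
            throw new IllegalArgumentException("slotsPerNode must be positive");
        }
        if (remainingWorkItems <= 0) {
            return 0; // nothing left to run, so no replacement is needed
        }
        // Ceiling division: nodes required to run all remaining work at once.
        int nodesNeeded = (remainingWorkItems + slotsPerNode - 1) / slotsPerNode;
        // Request only the shortfall, never more than the pool can supply.
        int shortfall = Math.max(0, nodesNeeded - liveNodesHeld);
        return Math.min(shortfall, freeNodesInPool);
    }

    public static void main(String[] args) {
        // 35 work items left, 10 slots per node, 3 live nodes, 6 free in the pool.
        System.out.println(nodesToAllocate(35, 10, 3, 6));
    }
}
```

The point of the sketch is that the request is computed once from the remaining work, so repeating the calculation while a work item sits in Limbo yields the same bounded answer rather than one more node on each pass.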

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira