Posted to dev@uima.apache.org by "Lou DeGenaro (JIRA)" <de...@uima.apache.org> on 2014/04/30 18:28:22 UTC

[jira] [Updated] (UIMA-3659) DUCC Job Driver (JD) OOMs when Total number of work items is large

     [ https://issues.apache.org/jira/browse/UIMA-3659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lou DeGenaro updated UIMA-3659:
-------------------------------

    Fix Version/s: 1.1.0-Ducc

> DUCC Job Driver (JD) OOMs when Total number of work items is large
> ------------------------------------------------------------------
>
>                 Key: UIMA-3659
>                 URL: https://issues.apache.org/jira/browse/UIMA-3659
>             Project: UIMA
>          Issue Type: Bug
>          Components: DUCC
>    Affects Versions: 1.0.0-Ducc
>            Reporter: Lou DeGenaro
>            Assignee: Lou DeGenaro
>             Fix For: 1.1.0-Ducc
>
>
> A Job with more than 300,000 total work items failed with Reason "Premature" after processing more than 70,000 of them.
> The Job Driver (JD) maintains a file in the user's log directory named work-item-status.json.gz, which contains the information shown by the WebServer on the Work Items tab of the Job Details page. As each work item is processed, the JD's WorkItemStateManager (WiSm) maintains an in-memory representation of its Id, Node, PID, State, and Start and End times. Periodically, the JD calls the WiSm's export method to rewrite this file.
> Although the amount of information per work item is relatively small, memory consumption becomes large when the number of work items is large, because these in-memory representations are kept for the lifetime of the Job.
> To alleviate this "designed-in" memory leak, the WiSm should keep only active work items in memory. Terminal work items should be flushed to disk. This change will affect the DUCC components that employ WiSm, specifically CLI, WS and JD.
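
The proposed change could be sketched roughly as follows. This is only an illustration of the "keep active items in memory, flush terminal items to disk" idea and not the actual WiSm code: the class name ActiveOnlyWorkItemTracker, its methods, the state names and the tab-separated append-only file are all assumptions made for the example (the real JD writes work-item-status.json.gz).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the real DUCC WorkItemStateManager.
final class ActiveOnlyWorkItemTracker {

    // Assumed record holding the fields named in the issue description.
    static final class WorkItem {
        final String id;
        final String node;
        final int pid;
        final long startMillis;
        String state = "operating"; // placeholder state name
        long endMillis = -1L;

        WorkItem(String id, String node, int pid, long startMillis) {
            this.id = id;
            this.node = node;
            this.pid = pid;
            this.startMillis = startMillis;
        }
    }

    // Only work items that have not yet reached a terminal state live here.
    private final Map<String, WorkItem> active = new HashMap<>();
    // Assumed append-only file that receives one line per terminal work item.
    private final Path terminalLog;

    ActiveOnlyWorkItemTracker(Path terminalLog) {
        this.terminalLog = terminalLog;
    }

    // Record a work item that has just been dispatched.
    synchronized void start(String id, String node, int pid, long startMillis) {
        active.put(id, new WorkItem(id, node, pid, startMillis));
    }

    // Move a work item to a terminal state: append it to disk, then forget it.
    synchronized void finish(String id, String terminalState, long endMillis)
            throws IOException {
        WorkItem wi = active.get(id);
        if (wi == null) {
            return; // unknown or already flushed
        }
        wi.state = terminalState;
        wi.endMillis = endMillis;
        String line = String.join("\t", wi.id, wi.node, Integer.toString(wi.pid),
                wi.state, Long.toString(wi.startMillis), Long.toString(wi.endMillis)) + "\n";
        Files.write(terminalLog, line.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        // Drop from memory only after the write succeeded.
        active.remove(id);
    }

    // Number of work items currently held in memory.
    synchronized int activeCount() {
        return active.size();
    }
}
```

With a layout like this, memory use would track the number of in-flight work items rather than the total for the Job. Readers such as the CLI and WS would then need to combine the on-disk terminal records with the in-memory active ones, which is why the change touches those components too.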


