Posted to dev@uima.apache.org by "Lou DeGenaro (JIRA)" <de...@uima.apache.org> on 2014/03/04 21:54:13 UTC

[jira] [Commented] (UIMA-3659) DUCC Job Driver (JD) OOMs when Total number of work items is large

    [ https://issues.apache.org/jira/browse/UIMA-3659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920001#comment-13920001 ] 

Lou DeGenaro commented on UIMA-3659:
------------------------------------

We remain backwards compatible for display of previous work, but remove the need for an in-memory image of all work items (WIs) for newly launched JDs.

[Transport]

1. DuccWorkJob gets new fields for work items: version, millis Max, Min, Avg, OperatingLeast and CompletedMost with getters & setters
2. Ditto for DriverStatusReport
3. We tag newly created DuccWorkJobs with wiVersion 1, while historical Jobs will be wiVersion 0

[Common]

4. New classes WorkItemStateAbstract, Keeper, Reader and Statistics to manage new files comprising active work, completed work and finally zipped work.  As such, the WS (employing Reader) will be able to display both active and completed work while the Job is in progress, and once the Job is finished JD will efficiently condense into a zip file which the WS will also be able to display.  The JD (employing Keeper) will no longer keep an in-memory representation of all work items, only those currently active.
5. For backward compatibility, we retain classes WorkItemStateJson, JsonGz, Manager and SerializedObjects.
6. We tweak DuccLogger to provide a method to fetch the component of a logger, employed by the newly added common code so as to be able to create additional loggers for the same component for debugging purposes
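
As a rough illustration of the active/completed/zip life cycle described in item 4, here is a minimal sketch. It is not the actual WorkItemStateKeeper: the class name, method names, file names and the comma-separated record layout are all assumptions made for the example. The point is only that memory holds the active items, while terminal items go straight to an append-only file.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical sketch; not the actual DUCC WorkItemStateKeeper. */
class WorkItemKeeperSketch {
    // File names are assumptions for illustration only.
    private final Path active;
    private final Path completed;
    private final Path zipped;
    // Only work items currently in flight are held in memory.
    private final Map<String, String> activeItems = new LinkedHashMap<>();

    WorkItemKeeperSketch(Path dir) {
        this.active = dir.resolve("active.txt");
        this.completed = dir.resolve("completed.txt");
        this.zipped = dir.resolve("work-items.zip");
    }

    /** Record a start: update memory, then rewrite the (small) active file. */
    void start(String id, String node) throws IOException {
        activeItems.put(id, id + "," + node);
        rewriteActive();
    }

    /** Record an end: drop the item from memory and append one line to the completed file. */
    void end(String id, long millis) throws IOException {
        String record = activeItems.remove(id);
        if (record == null) {
            return; // unknown or already completed
        }
        Files.write(completed,
                (record + "," + millis + "\n").getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        rewriteActive();
    }

    int activeCount() {
        return activeItems.size();
    }

    private void rewriteActive() throws IOException {
        // Default options create the file or truncate an existing one.
        Files.write(active, activeItems.values(), StandardCharsets.UTF_8);
    }

    /** On Job completion, condense both files into one zip and delete the originals. */
    Path finish() throws IOException {
        try (OutputStream out = Files.newOutputStream(zipped);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Path p : new Path[] { active, completed }) {
                if (!Files.exists(p)) {
                    continue;
                }
                zip.putNextEntry(new ZipEntry(p.getFileName().toString()));
                Files.copy(p, zip);
                zip.closeEntry();
            }
        }
        Files.deleteIfExists(active);
        Files.deleteIfExists(completed);
        return zipped;
    }
}
```

With this shape, memory use tracks the number of concurrently active work items, not the Total for the Job.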

[CLI]

7. We update DuccPerfStats to employ WorkItemStateReader for either legacy (v=0) or modern (v=1) as a parameter.  When legacy, the deprecated WorkItemStateManager is used.
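
The version switch could look roughly like the following. This is a sketch only: the interface and class names are hypothetical stand-ins, not the real WorkItemStateReader or WorkItemStateManager, and the returned strings merely mark which path was taken.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a source of work item state records. */
interface WorkItemSource {
    List<String> read();
}

/** Stands in for the legacy (v=0) path backed by the deprecated manager. */
class LegacySource implements WorkItemSource {
    public List<String> read() {
        return Arrays.asList("legacy");
    }
}

/** Stands in for the modern (v=1) path backed by the active/completed/zip files. */
class ModernSource implements WorkItemSource {
    public List<String> read() {
        return Arrays.asList("modern");
    }
}

class WorkItemSources {
    /** Pick the implementation from the wiVersion tag carried by the Job. */
    static WorkItemSource forVersion(int wiVersion) {
        switch (wiVersion) {
            case 0:
                return new LegacySource();
            case 1:
                return new ModernSource();
            default:
                throw new IllegalArgumentException("unknown wiVersion: " + wiVersion);
        }
    }
}
```

Callers such as the CLI or WS would then only need the Job's wiVersion to display either historical or newly launched Jobs.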

[OR]

8. StateManager transfers the new millis info from DriverStatusReport to DuccWorkJob upon receipt of publication from JD

[JD]

9. Employs WorkItemStateKeeper to manage two WI state information files for active (replaced from memory) and completed (appended) work items and a final zip file replacing them both upon Job completion.
10. Publishes as part of DriverStatusReport new WI millis statistics
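
The millis statistics can be maintained without retaining any per-item history. Below is a minimal sketch of such a running accumulator. The class and method names are hypothetical, and returning 0 before any work item has completed is an assumption of the example, not confirmed behavior.

```java
/** Hypothetical running statistics; not the actual DUCC statistics class. */
class MillisStatsSketch {
    private long count;
    private long sum;
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    /** Fold one completed work item's elapsed time into the totals. */
    void ended(long millis) {
        count++;
        sum += millis;
        min = Math.min(min, millis);
        max = Math.max(max, millis);
    }

    // Report 0 until at least one work item has completed (an assumption).
    long getMin() { return count == 0 ? 0 : min; }
    long getMax() { return count == 0 ? 0 : max; }
    long getAvg() { return count == 0 ? 0 : sum / count; }
}
```

Only four long values are held regardless of how many work items the Job processes, so they are cheap to carry in each published status report.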

[WS]

11. Delete CacheManager, which maintained a cached copy of recently viewed Job WorkItems state sets
12. Delete WorkItemStateHelper, which provided calculations for LeastOperating and MostCompleted millis for making a time-to-completion projection
13. Employ new LeastOperating and MostCompleted millis now contained in DuccWorkJob
14. Similar to CLI change, employ WorkItemStateReader for either legacy (v=0) or modern (v=1) as a parameter.
15. Employ millis Min/Max/Avg in DuccWorkJob for relevant part of Performance page display
16. Maximum number of WIs displayed is 4096
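
One way to honor the display cap in item 16 without loading a whole file is to stop reading once the limit is reached. The sketch below assumes a line-per-work-item text file and shows the first records encountered. Both are assumptions, as are the class and method names; which items the WS actually selects is not specified here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch of a capped read for display purposes. */
class DisplayCapSketch {
    static final int MAX_DISPLAYED = 4096; // limit stated in item 16

    /** Read at most 'limit' lines; the stream is lazy, so the rest is never loaded. */
    static List<String> readForDisplay(Path file, int limit) {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.limit(limit).collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```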

> DUCC Job Driver (JD) OOMs when Total number of work items is large
> ------------------------------------------------------------------
>
>                 Key: UIMA-3659
>                 URL: https://issues.apache.org/jira/browse/UIMA-3659
>             Project: UIMA
>          Issue Type: Bug
>          Components: DUCC
>    Affects Versions: 1.0-Ducc
>            Reporter: Lou DeGenaro
>            Assignee: Lou DeGenaro
>
> A Job of 300,000+ Total work items failed with Reason Premature after processing 70,000+ of them.
> The Job Driver (JD) maintains a file in the user's log directory named work-item-status.json.gz comprising the information shown by the WebServer on the Work Items tab of the Job Details page. As each work item is processed, the JD's WorkItemStateManager (WiSm) maintains an in-memory representation for Id, Node, PID, State, Start and End times. Periodically, the JD employs the WiSm's export method to re-write the above file.
> Although the amount of information is relatively small per work item, when the number of work items is large the amount of memory consumed is large since these in-memory representations are kept for the lifetime of the Job.
> To alleviate this "designed-in" memory leak, the WiSm should only keep active work items in-memory. Terminal work items should be flushed to disk. This change will affect DUCC components that employ WiSm, specifically CLI, WS and JD.


