Posted to dev@oozie.apache.org by "Andras Piros (JIRA)" <ji...@apache.org> on 2018/05/23 11:29:00 UTC

[jira] [Assigned] (OOZIE-3254) [coordinator] LAST_ONLY and NONE execution modes: possible OOM when there are too many coordinator actions to materialize

     [ https://issues.apache.org/jira/browse/OOZIE-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andras Piros reassigned OOZIE-3254:
-----------------------------------

    Assignee: Andras Piros

> [coordinator] LAST_ONLY and NONE execution modes: possible OOM when there are too many coordinator actions to materialize
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-3254
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3254
>             Project: Oozie
>          Issue Type: Bug
>          Components: coordinator
>    Affects Versions: 5.0.0
>            Reporter: Andras Piros
>            Assignee: Andras Piros
>            Priority: Major
>
> If a coordinator job is defined with a per-minute {{frequency}} (e.g. {{frequency="* * * * *"}}), its {{start-time}} lies well in the past, and its {{execution-mode}} is {{LAST_ONLY}} or {{NONE}}, too many {{CoordinatorActionBean}} instances can be kept on the JVM heap within {{CoordMaterializeTransitionXCommand#insertList}}, because those execution modes [*omit the check for the {{throttle}} value*|https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java#L439-L443].
> As a consequence, we can see several hundred thousand log entries, each one [*written while adding an action to {{CoordMaterializeTransitionXCommand#insertList}}*|https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java#L560-L566]:
> {noformat}
> [systest@tmc-4 ~]$ grep 'In storeToDB() coord action id' /var/log/oozie/oozie-cmf-OOZIE-1-OOZIE_SERVER-tmc-4.vpc.cloudera.com.log.out | wc -l
> 478408
> {noformat}
> A much worse consequence is that those {{CoordinatorActionBean}} instances remain reachable from a GC root (through {{insertList}} itself), so the JVM cannot free them until a subsequent call to {{insertList.clear()}}. In the worst case this results in an {{OutOfMemoryError}}.
> The size of {{CoordMaterializeTransitionXCommand#insertList}} should be checked against a configurable limit (default value something like 1000), and the list should be persisted and cleared whenever that limit is reached.
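
The proposal in the last paragraph can be sketched roughly as follows. This is only an illustration of the "persist and clear when a limit is reached" idea, not the actual Oozie patch: the class {{BoundedInsertBuffer}}, its {{persister}} callback and the {{pending()}} accessor are hypothetical names that do not exist in Oozie, and the limit of 1000 is simply the default suggested above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not part of Oozie): collects items to insert and
 * hands them to a persister in batches of at most `limit` elements, so the
 * in-memory list never grows beyond that limit.
 */
final class BoundedInsertBuffer<T> {
    private final int limit;
    private final Consumer<List<T>> persister;
    private final List<T> insertList = new ArrayList<>();

    BoundedInsertBuffer(int limit, Consumer<List<T>> persister) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        this.limit = limit;
        this.persister = persister;
    }

    /** Adds one item; persists and clears the list once the limit is reached. */
    void add(T item) {
        insertList.add(item);
        if (insertList.size() >= limit) {
            flush();
        }
    }

    /** Persists whatever is pending, then clears the list so the items can be garbage collected. */
    void flush() {
        if (insertList.isEmpty()) {
            return;
        }
        // Pass a copy so the persister cannot keep a reference to the internal list.
        persister.accept(new ArrayList<>(insertList));
        insertList.clear();
    }

    /** Number of items currently held in memory. */
    int pending() {
        return insertList.size();
    }
}
```

In this sketch the caller would still invoke {{flush()}} once after the materialization loop ends, so that a final, partially filled batch is persisted as well.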



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)