Posted to dev@oozie.apache.org by "Peter Cseh (JIRA)" <ji...@apache.org> on 2018/05/23 12:07:00 UTC
[jira] [Commented] (OOZIE-3254) [coordinator] LAST_ONLY and NONE
execution modes: possible OutOfMemoryError when there are too many
coordinator actions to materialize
[ https://issues.apache.org/jira/browse/OOZIE-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16487137#comment-16487137 ]
Peter Cseh commented on OOZIE-3254:
-----------------------------------
This can occur with more realistic frequencies when a coordinator is resubmitted (e.g. after disaster recovery) without changing the start time. After all, if the execution is LAST_ONLY, why should we care about the start time?
> [coordinator] LAST_ONLY and NONE execution modes: possible OutOfMemoryError when there are too many coordinator actions to materialize
> --------------------------------------------------------------------------------------------------------------------------------------
>
> Key: OOZIE-3254
> URL: https://issues.apache.org/jira/browse/OOZIE-3254
> Project: Oozie
> Issue Type: Bug
> Components: coordinator
> Affects Versions: 5.0.0
> Reporter: Andras Piros
> Assignee: Andras Piros
> Priority: Major
>
> If a coordinator job is defined with a {{frequency}} by the minute (e.g. {{frequency="* * * * *"}}), its {{start-time}} lies well in the past, and its {{execution-mode}} is {{LAST_ONLY}} or {{NONE}}, it can happen that too many {{CoordinatorActionBean}} instances are kept on the JVM heap within {{CoordMaterializeTransitionXCommand#insertList}}, as those execution modes [*omit the check for the {{throttle}} value*|https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java#L439-L443].
> As a consequence, we can see hundreds of thousands of log entries, each [*trying to grow {{CoordMaterializeTransitionXCommand#insertList}}*|https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java#L560-L566]:
> {noformat}
> [user@host ~]$ grep 'In storeToDB() coord action id' /var/log/oozie/oozie-HOSTNAME.log.out | wc -l
> 478408
> {noformat}
> A much worse consequence is that those {{CoordinatorActionBean}} instances stay reachable from a GC root (through the {{insertList}} itself), so the JVM cannot free them until a subsequent call to {{insertList.clear()}}. In the worst case, this results in an {{OutOfMemoryError}}.
> The size of {{CoordMaterializeTransitionXCommand#insertList}} should be checked against a configurable limit (default value something like 1000), and the list should be persisted and cleared whenever that limit is reached.
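
The proposal in the last paragraph of the description could be sketched roughly as follows. This is only an illustration of the "persist and clear at a limit" idea, not Oozie code: the class BoundedInsertBuffer, its methods, and the persister callback are hypothetical names, and the limit of 1000 is just the example default mentioned in the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not part of Oozie): collects items in an in-memory
 * list and hands them to a persister callback whenever a limit is reached,
 * so the list never holds more than limit - 1 items between calls.
 */
class BoundedInsertBuffer<T> {
    private final int limit;                      // assumed configurable, e.g. 1000
    private final Consumer<List<T>> persister;    // stands in for the real DB write
    private final List<T> insertList = new ArrayList<>();

    BoundedInsertBuffer(int limit, Consumer<List<T>> persister) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        this.limit = limit;
        this.persister = persister;
    }

    /** Adds one item and flushes as soon as the limit is reached. */
    void add(T item) {
        insertList.add(item);
        if (insertList.size() >= limit) {
            flush();
        }
    }

    /** Persists whatever is buffered and clears the list so the GC can reclaim it. */
    void flush() {
        if (insertList.isEmpty()) {
            return;
        }
        // Pass a copy so the persister may keep the batch after we clear ours.
        persister.accept(new ArrayList<>(insertList));
        insertList.clear();
    }

    /** Number of items currently held in memory. */
    int pending() {
        return insertList.size();
    }
}
```

A caller would add each materialized action through add() and call flush() once at the end of materialization to persist the final partial batch. Whether batches can be persisted independently inside the real command (for example with respect to transactions) is not addressed by this sketch.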
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)