You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@beam.apache.org by "Kenneth Knowles (Jira)" <ji...@apache.org> on 2020/10/07 09:28:00 UTC

[jira] [Updated] (BEAM-11034) State garbage collection timers set by Dataflow SimpleParDoFn pile up for the GlobalWindow

     [ https://issues.apache.org/jira/browse/BEAM-11034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kenneth Knowles updated BEAM-11034:
-----------------------------------
    Status: Open  (was: Triage Needed)

> State garbage collection timers set by Dataflow SimpleParDoFn pile up for the GlobalWindow
> ------------------------------------------------------------------------------------------
>
>                 Key: BEAM-11034
>                 URL: https://issues.apache.org/jira/browse/BEAM-11034
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>            Reporter: Sam Whittle
>            Assignee: Sam Whittle
>            Priority: P2
>
> If the dofn is stateful, garbage collection timers are set for the end of the window plus allowed lateness:
> https://github.com/apache/beam/blob/6fdde4f4eab72b49b10a8bb1cb3be263c5c416b5/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/SimpleParDoFn.java#L491
> For the global window this ends up setting garbage collection timers that will only fire once the pipeline is drained.  For pipelines that have constantly newly arriving unique stateful keys, and otherwise cleanup their state appropriately when triggering occurs, the # of timers builds up over time.
> Example window and trigger, where the user has the opportunity to clean up state for the key after at most a minute.  However they have no control over the timer set.
> GlobalWindows()
> .triggering(Repeatedly.forever(AfterFirst.of(
> AfterPane.elementCountAtLeast(5000),
> AfterProcessingTime.pastFirstElementInPane().plusDelayOf(Duration.standardMi
> nutes(1))).discardingFiredPanes().withAllowedLateness(Duration.ZERO);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)