You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@beam.apache.org by Lukasz Cwik <lc...@google.com> on 2017/12/20 22:08:17 UTC

Re: FlinkRunner restore from save point, CoGroupByKey holds onto state

+dev@beam.apache.org

The location of the GroupByKey/CoGroupByKey is where all windowing
information is being buffered before being fired by a trigger.
The windowing strategy is:
        Window<KV<String, String>> window = Window.<KV<String,
String>>into(FixedWindows.of(Duration.standardSeconds(10)))
            .withAllowedLateness(Duration.standardSeconds(5))
            .accumulatingFiredPanes()
            .triggering(Never.ever());

Is there an issue preventing proper GC?

On Tue, Dec 19, 2017 at 8:53 AM, Seth Albanese <
seth.albanese@accretivetg.com> wrote:

> I’m running Beam 2.2.0 on Flink 1.3 using KafkaIO. Reading from two
> topics, applying a fixed window, joining via a CoGrouByKey, and outputting
> to another topic.
>
> Example code that reproduces the issue can be seen here:
> https://gist.github.com/salbanese/c46df2718c09a897e04d498c3f59d9d7
>
> When I cancel the job with a save point via flink cancel –s
> /path/to/savepoint job-id, then restart the job from that save point via
> flink run –s /path/to/savepoint –c … each subsequent save point grows in
> size by about 30 percent or so, and eventually flink starts timing out and
> fails to cancel the job. Eliminating the CoGroupByKey seems to stop this
> behavior, and save points are consistent from one run to the next.
>
> I feel like I must be missing something.  Any advice would be appreciated.
>
> Thanks
> -seth
>