Posted by Lukasz Cwik on 2017/12/20 22:08:17 UTC

Re: FlinkRunner restore from save point, CoGroupByKey holds onto state

The GroupByKey/CoGroupByKey is where all windowed data is buffered
until a trigger fires.
The windowing strategy is:
        Window<KV<String, String>> window = Window.<KV<String,

Is there an issue preventing proper GC?
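
To make the GC question concrete, here is a minimal plain-Java sketch of the
expected behavior. It is a toy model, not Beam or Flink code. It assumes the
usual rule that a window's buffered state can be dropped once the watermark
reaches the end of the window plus the allowed lateness. The class
WindowStateModel and its methods are hypothetical names used only for
illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of per-window buffering; NOT Beam's or Flink's implementation. */
class WindowStateModel {
    private final long windowSizeMs;
    private final long allowedLatenessMs;
    private long watermarkMs = Long.MIN_VALUE;
    // Buffered elements keyed by window start. In a real job this is the
    // kind of state that a save point would have to capture.
    private final TreeMap<Long, List<String>> buffered = new TreeMap<>();

    WindowStateModel(long windowSizeMs, long allowedLatenessMs) {
        this.windowSizeMs = windowSizeMs;
        this.allowedLatenessMs = allowedLatenessMs;
    }

    /** True if the window starting at windowStart is past its GC time. */
    private boolean isExpired(long windowStart) {
        return windowStart + windowSizeMs + allowedLatenessMs <= watermarkMs;
    }

    /** Buffers a value in its fixed window; values for expired windows are dropped. */
    void add(long timestampMs, String value) {
        long windowStart = timestampMs - Math.floorMod(timestampMs, windowSizeMs);
        if (isExpired(windowStart)) {
            return; // too late: the window has already been garbage collected
        }
        buffered.computeIfAbsent(windowStart, k -> new ArrayList<>()).add(value);
    }

    /** Advances the watermark and drops expired windows; returns how many were dropped. */
    int advanceWatermark(long newWatermarkMs) {
        watermarkMs = Math.max(watermarkMs, newWatermarkMs);
        int dropped = 0;
        Iterator<Map.Entry<Long, List<String>>> it = buffered.entrySet().iterator();
        while (it.hasNext()) {
            if (!isExpired(it.next().getKey())) {
                break; // TreeMap is ordered by window start, so later windows are live too
            }
            it.remove();
            dropped++;
        }
        return dropped;
    }

    /** Total number of elements currently held in state. */
    int bufferedElementCount() {
        int count = 0;
        for (List<String> values : buffered.values()) {
            count += values.size();
        }
        return count;
    }
}
```

In this model, state only shrinks when advanceWatermark is called with a
value past a window's end plus its allowed lateness. If the watermark never
advances, nothing is dropped and the buffered state keeps growing. Whether
something like that is happening in this job is only a guess.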

On Tue, Dec 19, 2017 at 8:53 AM, Seth Albanese <> wrote:

> I’m running Beam 2.2.0 on Flink 1.3 using KafkaIO. The job reads from two
> topics, applies a fixed window, joins them via a CoGroupByKey, and writes
> to another topic.
> Example code that reproduces the issue can be seen here:
> When I cancel the job with a save point via flink cancel -s
> /path/to/savepoint job-id, then restart the job from that save point via
> flink run -s /path/to/savepoint -c …, each subsequent save point grows in
> size by about 30 percent, and eventually Flink starts timing out and
> fails to cancel the job. Eliminating the CoGroupByKey seems to stop this
> behavior, and save point sizes stay consistent from one run to the next.
> I feel like I must be missing something.  Any advice would be appreciated.
> Thanks
> -seth