Posted to user@flink.apache.org by Lars Skjærven <la...@gmail.com> on 2022/01/21 12:00:52 UTC

Window function - flush on job stop

We're doing a stream.keyBy().window().aggregate() to aggregate customer
feedback into sessions. Every now and then we have to update the job, e.g.
change the key, which means we can't easily continue from the previous state.

Cancelling the job (without restarting from the last savepoint) results in
losing ongoing sessions. So we typically go back a few hours when we
restart to minimize the loss.

Is there any way of making the job flush its content (sessions) on job
cancellation? That would split ongoing sessions in two, which
is perfectly fine for our purpose.

Any thoughts?

Lars

Re: Window function - flush on job stop

Posted by Caizhi Weng <ts...@gmail.com>.
Hi!

As far as I know there is currently no way to do this. However, if you'd
like to, you can implement it with a custom source. Before you stop the
job, you send a signal to this custom source (for example through a
shared file on HDFS or through a socket). When the custom source detects
the signal, it emits a very large watermark (for example Long.MAX_VALUE),
which pushes event time past the end of every open session window and so
cuts off the sessions.
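
To illustrate why a very large watermark cuts off the sessions, here is a
minimal plain-Java sketch. It is not Flink code and has no Flink dependency:
SessionFlushSketch, Session and SessionBuffer are hypothetical names made up
for this example, and the model is simplified to one open session per key
with in-order events. In a real job the custom source would, as far as I
know, emit Watermark.MAX_WATERMARK through its source context once it sees
the stop signal; the sketch only models the effect on the window state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Plain-Java model of keyed event-time session windows (hypothetical names, no Flink API).
class SessionFlushSketch {

    /** One open session: key, first and last event timestamp, and event count. */
    static final class Session {
        final String key;
        final long start;
        long lastEvent;
        int count;

        Session(String key, long ts) {
            this.key = key;
            this.start = ts;
            this.lastEvent = ts;
            this.count = 1;
        }

        @Override
        public String toString() {
            return key + "[" + start + ".." + lastEvent + "] n=" + count;
        }
    }

    /** Keeps one open session per key; a session closes once the watermark reaches lastEvent + gap. */
    static final class SessionBuffer {
        private final long gapMillis;
        private final Map<String, Session> open = new HashMap<>();

        SessionBuffer(long gapMillis) {
            this.gapMillis = gapMillis;
        }

        /** Simplification: every event for a key extends that key's open session. */
        void onEvent(String key, long ts) {
            Session s = open.get(key);
            if (s == null) {
                open.put(key, new Session(key, ts));
            } else {
                s.lastEvent = Math.max(s.lastEvent, ts);
                s.count++;
            }
        }

        /** Returns and removes every session whose gap has expired at this watermark. */
        List<Session> onWatermark(long watermark) {
            List<Session> closed = new ArrayList<>();
            Iterator<Session> it = open.values().iterator();
            while (it.hasNext()) {
                Session s = it.next();
                if (s.lastEvent + gapMillis <= watermark) {
                    closed.add(s);
                    it.remove();
                }
            }
            return closed;
        }

        int openCount() {
            return open.size();
        }
    }

    public static void main(String[] args) {
        SessionBuffer buffer = new SessionBuffer(30 * 60 * 1000L); // 30 minute gap (assumed)
        buffer.onEvent("customer-a", 1_000L);
        buffer.onEvent("customer-a", 5_000L);
        buffer.onEvent("customer-b", 2_000L);

        // A normal watermark is far below lastEvent + gap, so nothing is emitted yet.
        System.out.println("closed at watermark 6000: " + buffer.onWatermark(6_000L));

        // The "stop signal" watermark exceeds every session end, so all sessions flush.
        System.out.println("closed at final watermark: " + buffer.onWatermark(Long.MAX_VALUE));
    }
}
```

Note that sessions flushed this way are simply truncated at the last event
seen, so a session that continues after the restart shows up as two
sessions, which matches the splitting described in the question.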
