You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Qi Kang <mi...@126.com> on 2019/11/01 09:37:14 UTC

How to emit changed data only w/ Flink trigger?

Hi all,


We have a Flink job which aggregates sales volume and GMV data of each site on a daily basis. The code skeleton is shown as follows.


```
sourceStream
 .map(message -> JSON.parseObject(message, OrderDetail.class))
 .keyby("siteId")
 .window(TumblingProcessingTimeWindows.of(Time.days(1), Time.hours(-8)))
 .trigger(ContinuousProcessingTimeTrigger.of(Time.seconds(1)))
 .aggregate(new VolumeGmvAggregateFunc());
```


The window is triggered every second in order to refresh the data displayed on a real-time dashboard. Is there some way to output only those sites’ data which changed in 1 second period? Currently we’ve got 1000+ sites, so frequently emitting all aggregation records seems somewhat expensive.


BR, Qi Kang



Re: How to emit changed data only w/ Flink trigger?

Posted by kant kodali <ka...@gmail.com>.
I am new to Flink so I am not sure if I am giving you the correct answer so
you might want to wait for others to respond. But I think you should do

.inUpsertMode()


On Fri, Nov 1, 2019 at 2:38 AM Qi Kang <mi...@126.com> wrote:

> Hi all,
>
>
> We have a Flink job which aggregates sales volume and GMV data of each
> site on a daily basis. The code skeleton is shown as follows.
>
>
> ```
> sourceStream
>  .map(message -> JSON.parseObject(message, OrderDetail.class))
>  .keyby("siteId")
>  .window(TumblingProcessingTimeWindows.of(Time.days(1), Time.hours(-8)))
>  .trigger(ContinuousProcessingTimeTrigger.of(Time.seconds(1)))
>  .aggregate(new VolumeGmvAggregateFunc());
> ```
>
>
> The window is triggered every second in order to refresh the data
> displayed on a real-time dashboard. Is there some way to output only those
> sites’ data which changed in 1 second period? Currently we’ve got 1000+
> sites, so frequently emitting all aggregation records seems somewhat
> expensive.
>
>
> BR, Qi Kang
>
>
>

Re: How to emit changed data only w/ Flink trigger?

Posted by Taher Koitawala <ta...@gmail.com>.
You can do this by writing a custom trigger or evictor.

On Fri, Nov 1, 2019 at 3:08 PM Qi Kang <mi...@126.com> wrote:

> Hi all,
>
>
> We have a Flink job which aggregates sales volume and GMV data of each
> site on a daily basis. The code skeleton is shown as follows.
>
>
> ```
> sourceStream
>  .map(message -> JSON.parseObject(message, OrderDetail.class))
>  .keyby("siteId")
>  .window(TumblingProcessingTimeWindows.of(Time.days(1), Time.hours(-8)))
>  .trigger(ContinuousProcessingTimeTrigger.of(Time.seconds(1)))
>  .aggregate(new VolumeGmvAggregateFunc());
> ```
>
>
> The window is triggered every second in order to refresh the data
> displayed on a real-time dashboard. Is there some way to output only those
> sites’ data which changed in 1 second period? Currently we’ve got 1000+
> sites, so frequently emitting all aggregation records seems somewhat
> expensive.
>
>
> BR, Qi Kang
>
>
>