Posted to user-zh@flink.apache.org by wang edmond <ed...@hotmail.com> on 2021/11/04 05:33:03 UTC

Re: Window computation: non-continuous data causes windows to fire late

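Hello,

You can configure an idle timeout for watermark generation; once the timeout passes with no events, a watermark is still generated.

See the "Dealing With Idle Sources" section of the official documentation: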

https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/event-time/generating_watermarks/
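
A minimal sketch of what that configuration might look like with the Flink 1.14 DataStream API. `SensorReading`, its `timestampMillis` field, and the two durations are hypothetical and chosen only for illustration. Note also that, as far as I understand it, `withIdleness` marks an input as idle so that other inputs are not held back by it; whether that alone advances the watermark when the only input goes quiet is worth verifying for the case in this thread.

```java
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;

// Hypothetical event type, used only for illustration.
class SensorReading {
    public long timestampMillis;
    public String key;
}

class IdleSourceStrategy {
    // Builds a strategy that tolerates 20 s of out-of-orderness and
    // declares an input idle after 1 minute without events.
    // Both durations are assumptions; tune them for the actual job.
    static WatermarkStrategy<SensorReading> build() {
        return WatermarkStrategy
                .<SensorReading>forBoundedOutOfOrderness(Duration.ofSeconds(20))
                .withTimestampAssigner((event, recordTimestamp) -> event.timestampMillis)
                .withIdleness(Duration.ofMinutes(1));
    }
}
```

The resulting strategy would then be passed to `assignTimestampsAndWatermarks` on the stream, or to the source when it is created.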




________________________________
From: Gen Luo <lu...@gmail.com>
Sent: November 3, 2021 15:14
To: user-zh@flink.apache.org <us...@flink.apache.org>
Subject: Re: Window computation: non-continuous data causes windows to fire late

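The WatermarkGenerator interface has onEvent and onPeriodicEmit. onPeriodicEmit
is called periodically, so you could probably implement logic there that emits a newly computed watermark when onEvent has not been called for some time. As long as the new watermark is later than the end of every window the current watermark still leaves open, it should trigger all of them.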
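
One way the idea could be sketched is shown below. This is a plain-Java model of the logic with no Flink dependency, so it can be run on its own; the class name, the two parameters, and the choice to advance the watermark by the elapsed idle wall-clock time are all assumptions, not something the thread or Flink prescribes. It also assumes event timestamps roughly track wall-clock time.

```java
// Plain-Java model of an idle-aware periodic watermark; not a Flink class.
// The two methods mirror the shape of WatermarkGenerator.onEvent and
// WatermarkGenerator.onPeriodicEmit, with the clock passed in explicitly.
final class IdleAdvancingWatermark {
    private final long maxOutOfOrdernessMs; // tolerated lateness (assumed parameter)
    private final long idleTimeoutMs;       // quiet period before advancing (assumed parameter)
    private long maxEventTimestampMs;
    private long lastEventWallClockMs;
    private long currentWatermarkMs = Long.MIN_VALUE;
    private boolean seenEvent = false;

    IdleAdvancingWatermark(long maxOutOfOrdernessMs, long idleTimeoutMs) {
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
        this.idleTimeoutMs = idleTimeoutMs;
    }

    // Called for every event: track the highest timestamp and when it arrived.
    void onEvent(long eventTimestampMs, long nowMs) {
        maxEventTimestampMs = seenEvent
                ? Math.max(maxEventTimestampMs, eventTimestampMs)
                : eventTimestampMs;
        lastEventWallClockMs = nowMs;
        seenEvent = true;
    }

    // Called periodically: returns the watermark to emit.
    long onPeriodicEmit(long nowMs) {
        if (!seenEvent) {
            return currentWatermarkMs; // nothing to base a watermark on yet
        }
        // Normal case: bounded out-of-orderness behind the max timestamp.
        long candidate = maxEventTimestampMs - maxOutOfOrdernessMs - 1;
        long idleForMs = nowMs - lastEventWallClockMs;
        if (idleForMs >= idleTimeoutMs) {
            // Idle case: push event time forward by the idle duration so
            // that open windows eventually close without new data.
            candidate = maxEventTimestampMs + idleForMs;
        }
        // A watermark must never move backwards.
        currentWatermarkMs = Math.max(currentWatermarkMs, candidate);
        return currentWatermarkMs;
    }
}
```

In an actual job, a class implementing Flink's `WatermarkGenerator<T>` could hold this logic, read the clock with `System.currentTimeMillis()`, and call `output.emitWatermark(new Watermark(...))` from `onPeriodicEmit`. The trade-off is that events arriving after the gap with timestamps behind the advanced watermark would be treated as late.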

On Mon, Nov 1, 2021 at 5:20 PM yuankuo.xia <en...@vip.qq.com>
wrote:

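> Hi,
>
>
> Background: I am using event-time windows for aggregation, but the data is not continuous. For example, data arrives during periods A and B, but there is a 30-minute gap with no data between period A and period B.
>
>
> Problem: Because the data is not continuous, the last window of period A does not fire; it waits until new data arrives.
>
>
> Is there a way to solve this, for example by firing all windows after a period with no incoming data? I looked at the Trigger interface but could not come up with a good implementation.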