Posted to dev@flink.apache.org by Rong Rong <wa...@gmail.com> on 2019/05/01 04:06:03 UTC

Re: The contradiction between event time and natural time from EventTimeTrigger

Hi Zhipeng,

Please see my explanation below:

> From the default EventTimeTrigger source code, I found that only onElement
> method (will judge the watermark) and onEventTime method only have a chance
> to trigger TriggerResult.FIRE;
> Therefore, the default EventTimeTrigger is assumed and must be "never
> stop! The data stream" will have "correct" Real-time results, so as long as
> the interval between the two eventtimes is too large, greater than the time
> window interval, or the end time of the window has not arrived yet, there
> is no new data (flow interruption, neither element(onElement) nor
> eventtime(onEventtime)), then the latest time the output of the window must
> be untimely or non-real-time (if i use eventtime to do the bounded window
> aggregation of the stream, i must have the near future data support, once
> it is interrupted, it will not be real-time), and i must wait until the new
> data stream is connected.


The answer is yes and no:
1. If no element falls into a given window at all, the window is never
created, so there is nothing to fire.
2. When the first element arrives for a given window (assuming no late
arrival), the window is created and an event-time timer is registered for
it. So even if no further elements arrive, the existing window (with at
least one element) will still fire promptly.
3. However, since this is an event-time trigger, the watermark has to
advance before the internal timer service will "activate" the registered
timer.

Regarding point #3: I could be wrong on this, but if the source function
does not advance the watermark at all unless an element is received from
the external data source, then yes, the window will probably be stuck.
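
To make the three points above concrete, here is a minimal toy model in
plain Java. It is not Flink code: the class EventTimeWindowSketch and its
methods are hypothetical names for illustration only. It assumes tumbling
windows, no late elements, and a timer that fires once the watermark
reaches the window's last timestamp (end - 1).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model (NOT Flink code) of tumbling event-time windows fired by watermark progress. */
class EventTimeWindowSketch {
    private final long windowSize;
    // Window end (exclusive) -> element count. The key doubles as the registered timer.
    private final TreeMap<Long, Integer> openWindows = new TreeMap<>();
    private long watermark = Long.MIN_VALUE;

    EventTimeWindowSketch(long windowSize) {
        this.windowSize = windowSize;
    }

    /** A window only comes into existence when an element lands in it (points 1 and 2). */
    void onElement(long timestamp) {
        long windowEnd = timestamp - Math.floorMod(timestamp, windowSize) + windowSize;
        openWindows.merge(windowEnd, 1, Integer::sum);
    }

    /** Only watermark progress fires timers (point 3); the watermark never moves backwards. */
    List<String> advanceWatermark(long newWatermark) {
        watermark = Math.max(watermark, newWatermark);
        List<String> fired = new ArrayList<>();
        // A window fires once the watermark reaches its last timestamp, i.e. end - 1.
        while (!openWindows.isEmpty() && openWindows.firstKey() - 1 <= watermark) {
            Map.Entry<Long, Integer> e = openWindows.pollFirstEntry();
            long start = e.getKey() - windowSize;
            fired.add("window[" + start + "," + e.getKey() + ") count=" + e.getValue());
        }
        return fired;
    }
}
```

In this model, two elements at 1000 and 2000 with a 5000 ms window stay
buffered for as long as the watermark sits at 3000, no matter how much wall
clock time passes. They are emitted only when the watermark reaches 4999.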


> 1. use processing-time
> 2. "guaranteed" stream data event time interval is small and best
> sequential and never interrupted (if this can be guaranteed, use
> processing-time directly. What is the meaning of using EventTime and
> Watermark in the production environment and how to  test the real-time and
> accuracy of the data results? I am sorry I have confused from some flink
> streaming sql examples about the time window.)
> 3. Add new implementations or improvements:  the end time of window
> determined by assignWindows can trigger TriggerResult as soon as it reaches
> the natural time point or reaches the natural time point plus to watermark
> interval.


Regarding:
#1: The problem you described does not occur with processing time, because
nothing prevents the internal timer service from advancing for
processing-time triggers/timers: they use the system time, which always
advances.
#2/#3: These are not needed, as long as you guarantee that the watermark
advances promptly.
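
One possible way to keep the watermark advancing while the source is quiet
is to let it drift forward with wall clock time after an idle timeout.
Below is a rough sketch in plain Java. It is not a Flink interface:
IdleAwareWatermarkSketch, its methods, and both parameters are hypothetical.
It assumes that event timestamps roughly track wall clock time, and it
takes the wall clock as an argument so the behaviour is easy to check.

```java
/** Hypothetical watermark generator sketch in plain Java (NOT a Flink interface). */
class IdleAwareWatermarkSketch {
    private final long maxOutOfOrdernessMs; // assumed bound on event-time disorder
    private final long idleTimeoutMs;       // assumed quiet period before drifting starts
    private long maxTimestamp;
    private long lastElementWallClock;
    private boolean seenElement = false;
    private long lastWatermark = Long.MIN_VALUE;

    IdleAwareWatermarkSketch(long maxOutOfOrdernessMs, long idleTimeoutMs) {
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
        this.idleTimeoutMs = idleTimeoutMs;
    }

    /** Record the largest event timestamp seen and when (in wall clock time) it arrived. */
    void onElement(long eventTimestamp, long wallClockNow) {
        maxTimestamp = seenElement ? Math.max(maxTimestamp, eventTimestamp) : eventTimestamp;
        lastElementWallClock = wallClockNow;
        seenElement = true;
    }

    /** Intended to be polled periodically, including while no elements arrive. */
    long currentWatermark(long wallClockNow) {
        if (!seenElement) {
            return lastWatermark; // nothing observed yet, so nothing to advance
        }
        long candidate = maxTimestamp - maxOutOfOrdernessMs;
        long idleFor = wallClockNow - lastElementWallClock;
        if (idleFor > idleTimeoutMs) {
            // Source looks idle: push event time forward by the excess idle time.
            candidate += idleFor - idleTimeoutMs;
        }
        // Watermarks must never decrease, even if an older element shows up later.
        lastWatermark = Math.max(lastWatermark, candidate);
        return lastWatermark;
    }
}
```

The trade-off is that an element arriving after a long pause with an old
timestamp may then be treated as late, so the timeout has to be chosen with
the data source in mind.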

I am not entirely sure my explanation is the most accurate one, so if anyone
has more insight, please share your thoughts :-)

Thanks,
Rong




On Sat, Apr 27, 2019 at 10:19 PM 邵志鹏 <bo...@163.com> wrote:

> Dear flinker:
>
>
> Look at the contradiction between event time and natural time from
> EventTimeTrigger.java (the window at the time of the break and the end of
> the window at the end of the end must not be "real time"):
>
>
> From the default EventTimeTrigger source code, I found that only onElement
> method (will judge the watermark) and onEventTime method only have a chance
> to trigger TriggerResult.FIRE;
>
>
> Therefore, the default EventTimeTrigger is assumed and must be "never
> stop! The data stream" will have "correct" Real-time results, so as long as
> the interval between the two eventtimes is too large, greater than the time
> window interval, or the end time of the window has not arrived yet, there
> is no new data (flow interruption, neither element(onElement) nor
> eventtime(onEventtime)), then the latest time the output of the window must
> be untimely or non-real-time (if i use eventtime to do the bounded window
> aggregation of the stream, i must have the near future data support, once
> it is interrupted, it will not be real-time), and i must wait until the new
> data stream is connected.
>
>
> The window result that was not output in time before the new data stream
> is come.
>  (OnProcessingTime will never be called after EventTime is set, so
> modifying onProcessingTime has no effect.
> Called when a processing-time timer that was set using the trigger context
> fires.).
>
>
> Then, to "real time TriggerResult" can only
> 1. use processing-time
> 2. "guaranteed" stream data event time interval is small and best
> sequential and never interrupted (if this can be guaranteed, use
> processing-time directly. What is the meaning of using EventTime and
> Watermark in the production environment and how to  test the real-time and
> accuracy of the data results? I am sorry I have confused from some flink
> streaming sql examples about the time window.)
> 3. Add new implementations or improvements:  the end time of window
> determined by assignWindows can trigger TriggerResult as soon as it reaches
> the natural time point or reaches the natural time point plus to watermark
> interval.
>
>
> Real-time results [unrelated to the specific Tumble Hop Session], while to
> the time of TriggerResult.FIRE...
> The watermark has increased, but there is no data, and then the window
> trigger has stopped...
>
>
> I don't know if my understanding is correct, I also hope to give pointers.
>
>
> Thanks.