Posted to commits@beam.apache.org by "Kenneth Knowles (JIRA)" <ji...@apache.org> on 2017/06/28 19:57:00 UTC

[jira] [Created] (BEAM-2535) Allow explicit output time independent of firing specification for all timers

Kenneth Knowles created BEAM-2535:
-------------------------------------

             Summary: Allow explicit output time independent of firing specification for all timers
                 Key: BEAM-2535
                 URL: https://issues.apache.org/jira/browse/BEAM-2535
             Project: Beam
          Issue Type: New Feature
          Components: beam-model, sdk-java-core
            Reporter: Kenneth Knowles
            Assignee: Kenneth Knowles


Today, we have insufficient control over the event time timestamp of elements output from a timer callback.

1. For an event time timer, it is the timestamp of the timer itself.
2. For a processing time timer, it is the current input watermark at the time of processing.

But for both of these, we may want to reserve the right to output with a particular event time timestamp, aka set a "watermark hold".
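
As a minimal sketch of the two defaults listed above (the names {{TimeDomain}} and {{DefaultOutputTimestamp}} are hypothetical and chosen for illustration; they are not Beam SDK API):

```java
import java.time.Instant;

// Hypothetical names for illustration only; this is not Beam SDK API.
enum TimeDomain { EVENT_TIME, PROCESSING_TIME }

final class DefaultOutputTimestamp {
  /**
   * Returns the event time timestamp that elements output from a timer
   * callback carry today, following the two rules in the list above.
   */
  static Instant of(TimeDomain domain, Instant timerTimestamp, Instant inputWatermark) {
    switch (domain) {
      case EVENT_TIME:
        // Rule 1: the timestamp of the timer itself.
        return timerTimestamp;
      case PROCESSING_TIME:
        // Rule 2: the input watermark at the time the timer is processed.
        return inputWatermark;
      default:
        throw new IllegalArgumentException("Unknown time domain: " + domain);
    }
  }
}
```

In neither case does the caller get to choose the output timestamp, which is the gap this issue is about.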

A naive implementation of a {{TimerWithWatermarkHold}} would work for making sure output is not droppable, but does not fully explain window expiration and late data/timer dropping.

In the natural interpretation of a timer as a feedback loop on a transform, timers should be viewed as another channel of input, with a watermark, and items on that channel _all need event time timestamps even if they are delivered according to a different time domain_.

I propose separating the specification of when a timer fires (with nice defaults) from the specification of the event time of the resulting outputs. These output timestamps will determine a side channel with a new "timer watermark" that constrains the output watermark.

 - We still need to fire event time timers according to the input watermark, so that they still fire.
 - Late data dropping and window expiration will be in terms of the minimum of the input watermark and the timer watermark. In this way, whenever a timer is set, the window is not going to be garbage collected.
 - We will need to make sure we have a way to "wake up" a window once it is expired; this may be as simple as exhausting the timer channel as soon as the input watermark indicates expiration of a window.
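
The points above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a proposed implementation: every name here ({{TimerChannelSketch}}, {{PendingTimer}}, {{fireEligible}}, {{isExpired}}) is hypothetical, the sketch covers the event time timers of a single window, the timer watermark is assumed to be the earliest output timestamp among pending timers, and a window is assumed to expire once the relevant watermark passes the end of the window plus allowed lateness.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names are Beam SDK API.
final class TimerChannelSketch {
  /** A pending event time timer: when it fires, and the timestamp of what it outputs. */
  static final class PendingTimer {
    final Instant fireAt;
    final Instant outputTimestamp;

    PendingTimer(Instant fireAt, Instant outputTimestamp) {
      this.fireAt = fireAt;
      this.outputTimestamp = outputTimestamp;
    }
  }

  private final List<PendingTimer> pending = new ArrayList<>();

  void set(PendingTimer timer) {
    pending.add(timer);
  }

  /** Assumption: the timer watermark is the earliest output timestamp still pending. */
  Instant timerWatermark() {
    Instant min = Instant.MAX; // no pending timers means no constraint
    for (PendingTimer t : pending) {
      if (t.outputTimestamp.isBefore(min)) {
        min = t.outputTimestamp;
      }
    }
    return min;
  }

  /** Firing is decided by the input watermark alone; fired timers leave the channel. */
  List<PendingTimer> fireEligible(Instant inputWatermark) {
    List<PendingTimer> ready = new ArrayList<>();
    for (PendingTimer t : pending) {
      if (!t.fireAt.isAfter(inputWatermark)) {
        ready.add(t);
      }
    }
    pending.removeAll(ready);
    return ready;
  }

  /** Expiration is decided by min(input watermark, timer watermark). */
  boolean isExpired(Instant inputWatermark, Instant windowEnd, Duration allowedLateness) {
    Instant timerWm = timerWatermark();
    Instant effective = inputWatermark.isBefore(timerWm) ? inputWatermark : timerWm;
    return effective.isAfter(windowEnd.plus(allowedLateness));
  }
}
```

With a timer pending whose output timestamp falls inside the window, {{isExpired}} stays false no matter how far the input watermark advances, which matches the second point: a set timer keeps its window from being garbage collected. Once the timer fires and is removed, the timer watermark no longer constrains expiration.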



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)