Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/03 16:24:21 UTC

[GitHub] [beam] kennknowles opened a new issue, #18145: OutputTimeFn and Accumulating Mode is Confusing

kennknowles opened a new issue, #18145:
URL: https://github.com/apache/beam/issues/18145

   See [here](https://github.com/tgroh/beam/commit/2238df334a368ce1a41e14ee616be954c5430c73) for an example pipeline.
   
   The timestamp used by a pane does not change based on the accumulation mode of the windowing strategy. As a result, elements that have associated timestamps cannot be safely reassigned to those timestamps after a GroupByKey if more than one pane could have been produced, regardless of the `OutputTimeFn`. The first example pipeline demonstrates two PCollections where the elements within the last PCollection cannot be reassigned to their timestamps, even though we are using `OutputTimeFn#outputAtEarliestInputTimestamp` and 
   
   With a more complex windowing strategy such as sessions, this is even more confusing. A session that spans more than one of the downstream windows, but that is produced in multiple panes, will over time be assigned to later and later windows as more panes are produced. Thus, a pipeline that produces session windows and wants to group the sessions by the point at which they started must only ever produce a single pane per session.
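
The behavior described above can be illustrated with a small, self-contained simulation. This is only a sketch of a toy model, not Beam code: the class `PaneTimestampSketch` and the methods `fire`, `canReassign`, and `fixedWindowStart` are hypothetical names invented for this example. It assumes that each pane's timestamp is the earliest timestamp among the elements that arrived since the previous firing, while in accumulating mode the pane's contents include every element seen so far.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only (hypothetical names, no Beam dependency).
class PaneTimestampSketch {

    /** One fired pane: its output timestamp and the element timestamps it contains. */
    static final class Pane {
        final long timestamp;
        final List<Long> elements;

        Pane(long timestamp, List<Long> elements) {
            this.timestamp = timestamp;
            this.elements = elements;
        }
    }

    /**
     * Each inner list holds the element timestamps that arrived between two firings.
     * Assumption: the pane timestamp is the earliest timestamp of the newly arrived
     * elements, whether or not the pane also re-emits older elements.
     */
    static List<Pane> fire(List<List<Long>> arrivals, boolean accumulating) {
        List<Pane> panes = new ArrayList<>();
        List<Long> seen = new ArrayList<>();
        for (List<Long> batch : arrivals) {
            if (batch.isEmpty()) {
                continue; // nothing new arrived, so no pane in this toy model
            }
            long earliestNew = Long.MAX_VALUE;
            for (long ts : batch) {
                earliestNew = Math.min(earliestNew, ts);
            }
            seen.addAll(batch);
            List<Long> contents = accumulating ? new ArrayList<>(seen) : new ArrayList<>(batch);
            panes.add(new Pane(earliestNew, contents));
        }
        return panes;
    }

    /** True if no element in the pane is earlier than the pane's own timestamp. */
    static boolean canReassign(Pane pane) {
        for (long ts : pane.elements) {
            if (ts < pane.timestamp) {
                return false;
            }
        }
        return true;
    }

    /** Start of the fixed window of the given size that contains the timestamp. */
    static long fixedWindowStart(long timestamp, long size) {
        return timestamp - Math.floorMod(timestamp, size);
    }

    public static void main(String[] args) {
        List<List<Long>> arrivals = Arrays.asList(
            Arrays.asList(5L, 12L), Arrays.asList(25L, 31L));
        for (Pane pane : fire(arrivals, true)) {
            System.out.println("pane timestamp=" + pane.timestamp
                + " elements=" + pane.elements
                + " downstream window start=" + fixedWindowStart(pane.timestamp, 10)
                + " reassignable=" + canReassign(pane));
        }
    }
}
```

Under these assumptions, the second accumulating pane carries timestamp 25 but still contains the elements at 5 and 12, so those elements cannot be moved back to their own timestamps. The same pane also lands in a later downstream fixed window (starting at 20) than the first pane (starting at 0), which mirrors the session drift described above. Passing `false` for `accumulating` models discarding mode, where each pane contains only its new elements.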
   
   Imported from Jira [BEAM-1372](https://issues.apache.org/jira/browse/BEAM-1372). Original Jira may contain additional context.
   Reported by: tgroh.

