Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/04 17:21:05 UTC

[GitHub] [beam] damccorm opened a new issue, #20436: Respect timestamp OutputTime Windowing Strategy configuration in Lifted CombineFns.

damccorm opened a new issue, #20436:
URL: https://github.com/apache/beam/issues/20436

   The Go SDK currently retains an arbitrary timestamp per key per bundle when performing a lifted combine.
   However, the windowing strategy can specify a preferred output time (the OutputTime setting), which the lifted combine should respect:
   https://github.com/apache/beam/blob/a5b2046b10bebc59c5bde41d4cb6498058fdada2/model/pipeline/src/main/proto/beam_runner_api.proto#L901
   
   The code in question for the Go SDK:
   https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/combine.go#L395
   
   At present this implementation is "correct", as the default output time is Unspecified, and there's no user mechanism to configure a windowing strategy to this granularity.
   
   There are two parts to this:
   1. Propagate the windowing strategy information to exec.LiftedCombine somehow and implement the correct output. This can be done whether or not 2 is implemented.
   2. Provide a trigger configuration for beam.WindowInto, so this can be configured on the user side. This is significantly more work.
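
   As a rough illustration of part 1, the sketch below shows how a per-key timestamp could be chosen once the OutputTime setting is known. It is not the code in exec.LiftedCombine: the types OutputTime, EventTime, Window, and tsState and the method observe are hypothetical stand-ins, the enum names are modeled on OutputTime.Enum in beam_runner_api.proto, and the window is assumed to be half-open so that its last timestamp is End - 1. For Unspecified, the sketch keeps the first timestamp seen, which is one way to get the "arbitrary" behavior described above.

```go
package main

// OutputTime is a hypothetical Go mirror of OutputTime.Enum in
// beam_runner_api.proto (assumed values, not the SDK's own type).
type OutputTime int

const (
	Unspecified OutputTime = iota
	EndOfWindow
	LatestInPane
	EarliestInPane
)

// EventTime is a hypothetical stand-in for an event timestamp in milliseconds.
type EventTime int64

// Window is a hypothetical half-open interval window [Start, End).
type Window struct {
	Start, End EventTime
}

// MaxTimestamp returns the last timestamp that falls inside the window.
func (w Window) MaxTimestamp() EventTime { return w.End - 1 }

// tsState tracks the timestamp retained for one key within one bundle.
type tsState struct {
	ts   EventTime
	seen bool
}

// observe updates the retained timestamp for an incoming element
// according to the configured output time.
func (s *tsState) observe(ot OutputTime, w Window, elem EventTime) {
	if !s.seen {
		s.seen = true
		if ot == EndOfWindow {
			s.ts = w.MaxTimestamp()
		} else {
			s.ts = elem
		}
		return
	}
	switch ot {
	case EndOfWindow:
		s.ts = w.MaxTimestamp()
	case LatestInPane:
		if elem > s.ts {
			s.ts = elem
		}
	case EarliestInPane:
		if elem < s.ts {
			s.ts = elem
		}
	default:
		// Unspecified: any input timestamp is acceptable, so keep the first.
	}
}
```

   In a real implementation the strategy would need to be threaded from the pipeline graph into the combine's per-key cache; how to do that is the open question in part 1.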
   
   This matters only when using windows other than the Global Window together with a Lifted Combine, which typically happens only in batch contexts. However, since Beam is a unified model, the windowing features should work correctly in both execution modes of a Go SDK pipeline.
   
   Imported from Jira [BEAM-10302](https://issues.apache.org/jira/browse/BEAM-10302). Original Jira may contain additional context.
   Reported by: lostluck.

