Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/10/06 17:11:41 UTC

[GitHub] [beam] lukecwik commented on pull request #15540: [BEAM-12931] Allow for DoFn#getAllowedTimestampSkew() when checking the output timestamp

lukecwik commented on pull request #15540:
URL: https://github.com/apache/beam/pull/15540#issuecomment-936715458


   > > Won't this allow for infinite skew? If you have a timer at `X` and a skew of `-1`, then the first time the timer is processed you can output at time `X-1`, and when it gets scheduled again you can output at `X-2`, since the new timer's timestamp is `X-1`.
   > 
   > My understanding is that these checks exist to stop people from doing the wrong thing without realizing it. We don't even take any different action based on this variable. It seems okay to apply the check to each specific output timestamp and let you skew more if you chain timers in this fashion.
   > 
   > On a more practical note, there are reasons why you might want a timer to output an earlier element if you've properly set up watermark holds. There's currently no way to do that, so we need some allowance. It would probably be better if we could constrain skew from the first output timestamp, but I don't think that's available in the later timers, right?
   > 
   > If you disagree with the approach, I can bring this up on the email thread so others can chime in, in case they are not following the discussion here.
   
   I think users will be surprised when their data is dropped as late because they output past the watermark skew bound. The existing logic guarded against this explicitly because it would be surprising for users, so I do believe it is important enough to discuss whether there is another approach to solve this or whether we are OK with this happening.
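
To make the chained-timer concern concrete, here is a minimal sketch of the arithmetic only. It does not use the Beam API. The class `ChainedTimerSkew` and its methods are hypothetical names, and it assumes the skew is expressed as a non-negative number of milliseconds and that the check is applied per output against the timestamp of the timer currently firing, as described in the quoted discussion.

```java
import java.util.ArrayList;
import java.util.List;

public class ChainedTimerSkew {

    /**
     * Hypothetical per-output check: an output is accepted if it is at most
     * allowedSkew earlier than the timestamp of the timer that is firing.
     */
    public static boolean isWithinSkew(long timerTimestamp, long outputTimestamp, long allowedSkew) {
        return outputTimestamp >= timerTimestamp - allowedSkew;
    }

    /**
     * Simulates a timer that, on each firing, outputs as early as the check
     * allows and then re-sets itself at that output timestamp.
     * Returns the output timestamp produced by each firing.
     */
    public static List<Long> simulate(long firstTimerTimestamp, long allowedSkew, int firings) {
        List<Long> outputs = new ArrayList<>();
        long timerTimestamp = firstTimerTimestamp;
        for (int i = 0; i < firings; i++) {
            // Earliest output that still passes the per-output check.
            long outputTimestamp = timerTimestamp - allowedSkew;
            if (!isWithinSkew(timerTimestamp, outputTimestamp, allowedSkew)) {
                throw new IllegalStateException("output rejected by skew check");
            }
            outputs.add(outputTimestamp);
            // Assumption: the chained timer takes the output timestamp as its own.
            timerTimestamp = outputTimestamp;
        }
        return outputs;
    }

    public static void main(String[] args) {
        // Timer at X = 100 with a skew of 1, fired three times in a chain.
        System.out.println(simulate(100L, 1L, 3)); // prints [99, 98, 97]
    }
}
```

Every individual output passes the check, yet after `n` chained firings the output sits `n * allowedSkew` before the original timer timestamp, which is the unbounded drift raised in the first quoted comment.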


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org