Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2018/11/01 22:04:07 UTC

[GitHub] EronWright commented on issue #6980: [FLINK-5697] [kinesis] Add periodic per-shard watermark support

EronWright commented on issue #6980: [FLINK-5697] [kinesis] Add periodic per-shard watermark support
URL: https://github.com/apache/flink/pull/6980#issuecomment-435202809
 
 
   There is a caveat with this implementation that the docs should perhaps mention: it may produce spurious late events when processing a backlog of data.
   
   Here's an example of when that may occur. Imagine that subtask 1 is processing shard A and subtask 2 is processing shard B. Shard A has reached 6:00 in event time (as per the assigner), so subtask 1 emits the corresponding watermark. At this point, subtask 1 has made the irrevocable assertion that any subsequent events it emits will be later than 6:00. Meanwhile, shard B is at 5:30 and undergoes a split into C and D. If either child shard (C or D) is subsequently assigned to subtask 1, its events timestamped before 6:00 will be considered late due to the assertion made earlier.
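
   The scenario can be sketched with a toy model in Python. This is only an illustration: the `Subtask` class, its methods, and the `minutes` helper are hypothetical and are not the connector's API. The sketch assumes that a subtask's watermark is the minimum event time across the shards it owns and that an emitted watermark never moves backwards.

```python
from dataclasses import dataclass, field


def minutes(hhmm: str) -> int:
    """Convert an 'H:MM' event time to minutes since midnight."""
    h, m = hhmm.split(":")
    return int(h) * 60 + int(m)


@dataclass
class Subtask:
    """Toy model of one consumer subtask (hypothetical, not Flink's classes)."""
    shard_watermarks: dict = field(default_factory=dict)  # shard id -> latest event time seen
    emitted_watermark: int = -1                            # last watermark sent downstream
    late_events: list = field(default_factory=list)        # (shard, event_time) pairs

    def process(self, shard: str, event_time: int) -> None:
        # An event at or before the emitted watermark contradicts the
        # assertion already made downstream, so it counts as late.
        if event_time <= self.emitted_watermark:
            self.late_events.append((shard, event_time))
        prev = self.shard_watermarks.get(shard, event_time)
        self.shard_watermarks[shard] = max(prev, event_time)

    def emit_periodic_watermark(self) -> int:
        # Assumption: subtask watermark = min over owned shards,
        # clamped so that it never regresses.
        if self.shard_watermarks:
            candidate = min(self.shard_watermarks.values())
            self.emitted_watermark = max(self.emitted_watermark, candidate)
        return self.emitted_watermark


# Subtask 1 owns shard A, which has reached 6:00, and emits that watermark.
subtask1 = Subtask()
subtask1.process("A", minutes("6:00"))
wm = subtask1.emit_periodic_watermark()

# Shard B (at 5:30 on subtask 2) splits into C and D; C lands on subtask 1.
subtask1.process("C", minutes("5:31"))
wm_after = subtask1.emit_periodic_watermark()
```

   In this model the 5:31 event from shard C is recorded in `late_events`, because the 6:00 watermark had already been emitted and cannot be withdrawn.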
   
   
