You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Thomas Weise (JIRA)" <ji...@apache.org> on 2018/02/08 22:06:00 UTC

[jira] [Commented] (FLINK-5697) Add per-shard watermarks for FlinkKinesisConsumer

    [ https://issues.apache.org/jira/browse/FLINK-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16357640#comment-16357640 ] 

Thomas Weise commented on FLINK-5697:
-------------------------------------

Since shards are immutable wrt their hash key range and records cannot move between shards, we should be able to use the parent shard IDs and the last read sequence to find when a newly discovered shard can be read from. Child shards don't need to be assigned to the same subtask, in which case we would need a way to know the last read offset from the parent shard from a different subtask for comparison with EndingSequenceNumber. Is it possible to retrieve the last checkpointed offsets from other subtasks outside of restore to perform such check? (It would still imply that consumption from a new child shard cannot start until the parent was checkpointed and therefore add latency, but would provide the ordering guarantee we are looking for?)

> Add per-shard watermarks for FlinkKinesisConsumer
> -------------------------------------------------
>
>                 Key: FLINK-5697
>                 URL: https://issues.apache.org/jira/browse/FLINK-5697
>             Project: Flink
>          Issue Type: New Feature
>          Components: Kinesis Connector, Streaming Connectors
>            Reporter: Tzu-Li (Gordon) Tai
>            Priority: Major
>
> It would be nice to let the Kinesis consumer be on-par in functionality with the Kafka consumer, since they share very similar abstractions. Per-partition / shard watermarks is something we can add also to the Kinesis consumer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)