Posted to issues@flink.apache.org by "Tzu-Li (Gordon) Tai (JIRA)" <ji...@apache.org> on 2016/06/04 12:37:59 UTC

[jira] [Updated] (FLINK-4020) Remove shard list querying from Kinesis consumer constructor

     [ https://issues.apache.org/jira/browse/FLINK-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tzu-Li (Gordon) Tai updated FLINK-4020:
---------------------------------------
    Description: 
Currently, FlinkKinesisConsumer queries the whole list of shards in its constructor, which forces the client to be able to access Kinesis as well. This is also a drawback for handling Kinesis-side resharding, since we'd want all shard listing, shard-to-task assignment, and shard-end handling (a shard ends as a result of resharding) to be done independently within task life cycle methods, with well-defined results.
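
To illustrate the direction described above, here is a minimal sketch in plain Java rather than actual Flink connector code. Every name in it (ShardLister, LazyShardSource, the open(subtaskIndex, parallelism) signature, and the hash-modulo assignment rule) is a hypothetical stand-in, not the connector's API. It shows only that the constructor does no remote call, and that each subtask can list shards and pick its own share inside a life cycle method.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for whatever client call lists the shards of a stream. */
interface ShardLister {
    List<String> listShardIds(String streamName);
}

/** Sketch of a source whose constructor performs no remote calls. */
class LazyShardSource {
    private final String streamName;
    private final ShardLister lister;
    private List<String> assignedShards = new ArrayList<>();

    LazyShardSource(String streamName, ShardLister lister) {
        // Only store configuration here, so the submitting client
        // never needs network access to the stream service.
        this.streamName = streamName;
        this.lister = lister;
    }

    /** Mirrors a task life cycle method: runs on the worker, once per subtask. */
    void open(int subtaskIndex, int parallelism) {
        List<String> allShards = lister.listShardIds(streamName);
        assignedShards = assign(allShards, subtaskIndex, parallelism);
    }

    /**
     * Assumed assignment rule: a shard belongs to the subtask whose index
     * equals the shard ID's hash modulo the parallelism. Every subtask can
     * compute this locally and all subtasks agree on the result.
     */
    static List<String> assign(List<String> shardIds, int subtaskIndex, int parallelism) {
        List<String> mine = new ArrayList<>();
        for (String id : shardIds) {
            if (Math.floorMod(id.hashCode(), parallelism) == subtaskIndex) {
                mine.add(id);
            }
        }
        return mine;
    }

    List<String> getAssignedShards() {
        return assignedShards;
    }
}
```

With a rule like this, each shard lands on exactly one subtask as long as all subtasks see the same shard list. Keeping that list consistent across subtasks during resharding is the harder part and is not addressed by this sketch.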

The main thing to overcome is coordination between parallel subtasks. All subtasks will need to retry (due to Amazon's operation rate limits) until all subtasks have succeeded. We could probably use either ZooKeeper (ZK) or Amazon DynamoDB (user configurable) for coordinating subtask status.
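
The per-subtask retry mentioned above could look roughly like the following sketch. It is plain Java with hypothetical names (RateLimitedException, BackoffRetry.callWithBackoff) and assumes exponential backoff with random jitter, which the issue itself does not specify. It covers only a single subtask retrying a rate-limited call. Tracking whether all subtasks have succeeded (via ZK or DynamoDB) is not shown.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical exception standing in for a rate-limit error from the service. */
class RateLimitedException extends Exception {
    RateLimitedException(String message) {
        super(message);
    }
}

class BackoffRetry {
    /**
     * Calls the given operation, retrying on RateLimitedException with
     * exponential backoff and random jitter. Rethrows the last
     * RateLimitedException once maxAttempts calls have failed.
     */
    static <T> T callWithBackoff(Callable<T> call, int maxAttempts,
                                 long baseDelayMillis, long maxDelayMillis) throws Exception {
        if (maxAttempts < 1 || baseDelayMillis < 0 || maxDelayMillis < baseDelayMillis) {
            throw new IllegalArgumentException("invalid retry settings");
        }
        long delay = baseDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (RateLimitedException e) {
                if (attempt >= maxAttempts) {
                    throw e;
                }
                // Sleep a random time in [0, delay] so that parallel subtasks
                // do not all retry at the same instant.
                Thread.sleep(ThreadLocalRandom.current().nextLong(delay + 1));
                // Double the delay ceiling for the next attempt, up to the cap.
                delay = Math.min(maxDelayMillis, delay * 2);
            }
        }
    }
}
```

A subtask would wrap its shard-listing call in callWithBackoff. The jitter matters here because all parallel subtasks hit the same rate limit at roughly the same time when the job starts.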

  was:Currently FlinkKinesisConsumer is querying for the whole list of shards in the constructor, forcing the client to be able to access Kinesis as well. This is also a drawback for handling Kinesis-side resharding, since we'd want all shard listing / shard-to-task assigning / shard end (result of resharding) handling logic to be capable of being independently done within task life cycle methods, with defined and definite results.


> Remove shard list querying from Kinesis consumer constructor
> ------------------------------------------------------------
>
>                 Key: FLINK-4020
>                 URL: https://issues.apache.org/jira/browse/FLINK-4020
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Streaming Connectors
>            Reporter: Tzu-Li (Gordon) Tai
>
> Currently, FlinkKinesisConsumer queries the whole list of shards in its constructor, which forces the client to be able to access Kinesis as well. This is also a drawback for handling Kinesis-side resharding, since we'd want all shard listing, shard-to-task assignment, and shard-end handling (a shard ends as a result of resharding) to be done independently within task life cycle methods, with well-defined results.
> The main thing to overcome is coordination between parallel subtasks. All subtasks will need to retry (due to Amazon's operation rate limits) until all subtasks have succeeded. We could probably use either ZooKeeper (ZK) or Amazon DynamoDB (user configurable) for coordinating subtask status.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)