Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/05/11 14:09:00 UTC

[jira] [Commented] (FLINK-8944) Use ListShards for shard discovery in the flink kinesis connector

    [ https://issues.apache.org/jira/browse/FLINK-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16471994#comment-16471994 ] 

ASF GitHub Bot commented on FLINK-8944:
---------------------------------------

GitHub user kailashhd opened a pull request:

    https://github.com/apache/flink/pull/5992

    [FLINK-8944] [Kinesis Connector] Use listShards instead of DescribeStream for shard discovery as it offer higher rate limits
    
    ## What is the purpose of the change
    
    ListShards has a higher AWS rate limit than DescribeStream (whose limit applies at the AWS account level). This allows faster shard discovery in the Kinesis data source when streams are changed (resharded).
    
    ## Brief change log
     - Change the Kinesis connector to use listShards instead of DescribeStream for shard discovery.
    
    ## Verifying this change
    
    This change added tests and can be verified as follows:
     - Added a unit test that checks the code path with the Kinesis dependencies mocked out.
     - Tested by running a small Flink job with this connector.
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): (yes / *no*)
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / *no*)
      - The serializers: (yes / *no* / don't know)
      - The runtime per-record code paths (performance sensitive): (yes / *no* / don't know)
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / *no* / don't know)
      - The S3 file system connector: (yes / *no* / don't know)
    
    ## Documentation
    
      - Does this pull request introduce a new feature? (yes / *no*)
      - If yes, how is the feature documented? (*not applicable* / docs / JavaDocs / not documented)
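
For readers unfamiliar with the change described in the change log above, the following is a minimal sketch of what paginated shard discovery looks like, not the code in this pull request. `ShardLister` and `Page` are hypothetical stand-ins for the Kinesis client and its response, introduced only so the example runs without the AWS SDK. The one detail taken from the ListShards API itself is that a request carries either a stream name (first page) or a next token (later pages), not both.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShardDiscoverySketch {

    /** Hypothetical stand-in for one page of a ListShards response. */
    static final class Page {
        final List<String> shardIds;
        final String nextToken; // null when there are no further pages

        Page(List<String> shardIds, String nextToken) {
            this.shardIds = shardIds;
            this.nextToken = nextToken;
        }
    }

    /** Hypothetical stand-in for the Kinesis client; this is not the AWS SDK interface. */
    interface ShardLister {
        // Exactly one of streamName / nextToken is expected to be non-null.
        Page listShards(String streamName, String nextToken);
    }

    /** Collects all shard IDs of a stream by following next tokens until none is returned. */
    static List<String> discoverShards(ShardLister client, String streamName) {
        List<String> shardIds = new ArrayList<>();
        String nextToken = null;
        do {
            // First request names the stream; follow-up requests send only the token.
            Page page = (nextToken == null)
                    ? client.listShards(streamName, null)
                    : client.listShards(null, nextToken);
            shardIds.addAll(page.shardIds);
            nextToken = page.nextToken;
        } while (nextToken != null);
        return shardIds;
    }

    public static void main(String[] args) {
        // Fake client that serves a stream's shards in two pages.
        ShardLister fake = (stream, token) -> token == null
                ? new Page(Arrays.asList("shardId-000000000000", "shardId-000000000001"), "page-2")
                : new Page(Arrays.asList("shardId-000000000002"), null);

        System.out.println(discoverShards(fake, "example-stream"));
    }
}
```

A real implementation would also need to handle throttling and expired tokens, which this sketch leaves out.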


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kailashhd/flink KinesisProxy

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5992.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5992
    
----
commit 3188f24b13c9009e977b6fb25da4d40c93fc811e
Author: Kailash HD <kd...@...>
Date:   2018-03-26T16:42:25Z

    [FLINK-8944] [Kinesis Connector] Use listShards instead of DescribeStream for shard discovery as it offer higher rate limits

----


> Use ListShards for shard discovery in the flink kinesis connector
> -----------------------------------------------------------------
>
>                 Key: FLINK-8944
>                 URL: https://issues.apache.org/jira/browse/FLINK-8944
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Kailash Hassan Dayanand
>            Priority: Minor
>
> Currently, the DescribeStream AWS API used to get the list of shards has a restrictive rate limit on AWS (5 requests per second per account). This is problematic when running multiple Flink jobs on the same account, since each subtask calls DescribeStream. Changing this to ListShards will provide more flexibility on rate limits, as ListShards has a limit of 100 requests per second per data stream.
> More details on the mailing list. https://goo.gl/mRXjKh
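
To make the rate-limit concern in the issue description concrete, here is a rough back-of-the-envelope sketch. The two limits (5 requests per second per account, 100 per second per data stream) are the figures quoted in the issue. The job count, parallelism, and discovery interval are made-up illustration values, and the model assumes each subtask issues one request per discovery interval.

```java
public class DescribeStreamBudget {

    /** Average discovery requests per second if every subtask polls once per interval. */
    static double requestsPerSecond(int jobs, int subtasksPerJob, double discoveryIntervalSeconds) {
        return jobs * subtasksPerJob / discoveryIntervalSeconds;
    }

    public static void main(String[] args) {
        double describeStreamLimitPerAccount = 5.0; // figure quoted in the issue
        double listShardsLimitPerStream = 100.0;    // figure quoted in the issue

        // Illustration values only (assumptions, not taken from the issue).
        int jobs = 4;
        int subtasksPerJob = 16;
        double discoveryIntervalSeconds = 10.0;

        // With DescribeStream, all jobs on the account share a single budget.
        double accountWide = requestsPerSecond(jobs, subtasksPerJob, discoveryIntervalSeconds);
        System.out.println("Account-wide rate: " + accountWide
                + " req/s, over DescribeStream limit: " + (accountWide > describeStreamLimitPerAccount));

        // With ListShards, the budget is per stream; assume each job reads its own stream.
        double perStream = requestsPerSecond(1, subtasksPerJob, discoveryIntervalSeconds);
        System.out.println("Per-stream rate: " + perStream
                + " req/s, over ListShards limit: " + (perStream > listShardsLimitPerStream));
    }
}
```

Under these assumed numbers the account-wide rate works out to 6.4 requests per second, above the quoted DescribeStream limit, while each stream sees 1.6 requests per second, well under the quoted ListShards limit.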


