You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Lakshmi Rao (JIRA)" <ji...@apache.org> on 2018/06/29 20:06:00 UTC

[jira] [Updated] (FLINK-9692) Adapt maxRecords parameter in the getRecords call to optimize bytes read from Kinesis

     [ https://issues.apache.org/jira/browse/FLINK-9692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lakshmi Rao updated FLINK-9692:
-------------------------------
    Description: 
The Kinesis connector currently has a [constant value|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/ShardConsumer.java#L213] set for maxRecords that it can fetch from a single Kinesis getRecords call. However, in most realtime scenarios, the average size of the Kinesis record (in bytes) changes depending on the situation i.e. you could be in a transient scenario where you are reading large sized records and would hence like to fetch fewer records in each getRecords call (so as to not exceed the 2 Mb/sec [per shard limit|https://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetRecords.html] on the getRecords call). 

The idea here is to adapt the Kinesis connector to identify an average batch size prior to making the getRecords call, so that the maxRecords parameter can be appropriately tuned before making the call. 

This feature can be behind a [ConsumerConfigConstants|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/config/ConsumerConfigConstants.java] flag that defaults to false. 

 

  was:
The Kinesis connector currently has a [constant value|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/ShardConsumer.java#L213] set for maxRecords that it can fetch from a single Kinesis getRecords call. However, in most realtime scenarios, the average size of the Kinesis record (in bytes) changes depending on the situation i.e. you could be in a transient scenario where you are reading large sized records and would hence like to fetch fewer records in each getRecords call (so as to not exceed the 2 Mb/sec [per shard limit|https://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetRecords.html] on the getRecords call). 

The idea here is to adapt the Kinesis connector to identify an average batch size prior to making the getRecords call, so that the maxRecords parameter can be appropriately tuned before making the call. 


> Adapt maxRecords parameter in the getRecords call to optimize bytes read from Kinesis 
> --------------------------------------------------------------------------------------
>
>                 Key: FLINK-9692
>                 URL: https://issues.apache.org/jira/browse/FLINK-9692
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kinesis Connector
>    Affects Versions: 1.5.0, 1.4.2
>            Reporter: Lakshmi Rao
>            Priority: Major
>              Labels: performance
>
> The Kinesis connector currently has a [constant value|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/ShardConsumer.java#L213] set for maxRecords that it can fetch from a single Kinesis getRecords call. However, in most realtime scenarios, the average size of the Kinesis record (in bytes) changes depending on the situation i.e. you could be in a transient scenario where you are reading large sized records and would hence like to fetch fewer records in each getRecords call (so as to not exceed the 2 Mb/sec [per shard limit|https://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetRecords.html] on the getRecords call). 
> The idea here is to adapt the Kinesis connector to identify an average batch size prior to making the getRecords call, so that the maxRecords parameter can be appropriately tuned before making the call. 
> This feature can be behind a [ConsumerConfigConstants|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/config/ConsumerConfigConstants.java] flag that defaults to false. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)