Posted to dev@flink.apache.org by Aqib Mehmood <aq...@retailo.co> on 2022/05/25 09:36:54 UTC
Fwd: Flink Kinesis Connector GetRecords Frequency
Dear Support Team,
We were setting up a Kinesis-Flink pipeline on our cluster and we seem to
be running into an issue.
It seems that the Kinesis Data Streams GetRecords limit is 5 requests/second
per shard, and we were wondering if Flink surpasses this limit by default.
Is there any metric that we can use to figure out how many API calls Flink
is making under the hood to the Kinesis Data Stream per second to get data?
Thank you in advance.
Regards
Aqib Mehmood
Data Engineer
Retailo
Re: Flink Kinesis Connector GetRecords Frequency
Posted by "Teoh, Hong" <li...@amazon.co.uk.INVALID>.
Hi Aqib,
To add to Ahmed's response:
* The specific GetRecords configurations you can control for the Kinesis consumer are listed here: <https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/config/ConsumerConfigConstants.java#L379-L389>.
* You can see that the default interval is set such that a single consumer should not exceed 5 TPS per shard.
* Do you have any other consumers on the same stream? They, rather than Flink, might be what pushes you over the limit. If so, you can consider using the Enhanced Fan-Out (EFO) consumer.
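
To make the arithmetic behind the last two points concrete, here is a rough sketch. It assumes the connector's default polling interval is 200 ms (worth checking against the constants linked above for your version) and that every polling consumer on a shard uses the same interval. The class and method names are made up for illustration and are not part of Flink.

```java
public class GetRecordsBudget {
    // Per-shard GetRecords limit discussed in this thread (requests per second).
    static final int SHARD_LIMIT_TPS = 5;

    /** GetRecords calls per second that one polling consumer issues against one shard. */
    static double callsPerSecond(long intervalMillis) {
        return 1000.0 / intervalMillis;
    }

    /**
     * Smallest polling interval (ms) that keeps `consumers` identical polling
     * consumers on the same shard within the limit: ceil(1000 * consumers / limit).
     */
    static long minIntervalMillis(int consumers) {
        return (1000L * consumers + SHARD_LIMIT_TPS - 1) / SHARD_LIMIT_TPS;
    }

    public static void main(String[] args) {
        long assumedDefaultInterval = 200; // assumption: connector default, in ms
        System.out.println("one consumer: "
                + callsPerSecond(assumedDefaultInterval) + " calls/s per shard");
        System.out.println("two consumers need an interval of at least "
                + minIntervalMillis(2) + " ms each");
    }
}
```

So one consumer polling every 200 ms sits exactly at the limit, and a second polling consumer on the same shard would push the total to about 10 calls per second unless both back off to roughly 400 ms.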
> Is there any metric that we can use to figure out how many API calls Flink
> is making under the hood to the Kinesis Data Stream per second to get data?
No, the Flink consumer does not expose a specific metric for this, but you can probably use CloudTrail to query who is sending GetRecords API requests to your Kinesis stream:
https://docs.aws.amazon.com/athena/latest/ug/cloudtrail-logs.html
Regards,
Hong
Re: Flink Kinesis Connector GetRecords Frequency
Posted by Ahmed Hamdy <ha...@gmail.com>.
Hi Aqib
Flink uses FlinkKinesisConsumer as the source connector to read from
Kinesis sources.
You can refer to this part of the documentation
<https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/kinesis/#polling-default-record-publisher-1>
on how to configure FlinkKinesisConsumer for a certain GetRecords limit.
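
As a rough illustration of what that configuration could look like, the sketch below builds the consumer Properties with plain string keys so it runs without the connector on the classpath. The two `flink.shard.getrecords.*` keys are what I believe `ConsumerConfigConstants.SHARD_GETRECORDS_INTERVAL_MILLIS` and `ConsumerConfigConstants.SHARD_GETRECORDS_MAX` resolve to, so please verify them against your connector version. The class name, region, and values are placeholders.

```java
import java.util.Properties;

public class KinesisConsumerProps {
    /**
     * Builds consumer properties with an explicit GetRecords polling interval
     * and batch size. Key strings are assumed to match the connector's
     * ConsumerConfigConstants; check them for your Flink version.
     */
    static Properties build(long intervalMillis, int maxRecordsPerCall) {
        Properties props = new Properties();
        props.setProperty("aws.region", "us-east-1"); // placeholder region
        // Assumed key for SHARD_GETRECORDS_INTERVAL_MILLIS: wait between GetRecords calls per shard.
        props.setProperty("flink.shard.getrecords.intervalmillis", Long.toString(intervalMillis));
        // Assumed key for SHARD_GETRECORDS_MAX: maximum records requested per GetRecords call.
        props.setProperty("flink.shard.getrecords.maxrecordcount", Integer.toString(maxRecordsPerCall));
        return props;
    }

    public static void main(String[] args) {
        // 400 ms between calls is about 2.5 GetRecords calls per second per shard.
        Properties props = build(400, 5000);
        props.forEach((k, v) -> System.out.println(k + "=" + v));
        // In a real job these properties would be passed to the consumer, e.g.
        // new FlinkKinesisConsumer<>(streamName, deserializationSchema, props)
    }
}
```

Raising the interval lowers the call rate per shard at the cost of some read latency, so the values above are only a starting point.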
Regards
Ahmed Hamdy