Posted to dev@flink.apache.org by Aqib Mehmood <aq...@retailo.co> on 2022/05/25 09:36:54 UTC

Fwd: Flink Kinesis Connector GetRecords Frequency

Dear Support Team,

We were setting up a Kinesis-Flink pipeline on our cluster and we seem to
be running into an issue.

It seems that the Kinesis Data Streams GetRecords limit is 5 requests/second
and we were wondering if Flink exceeds this limit by default.

Is there any metric that we can use to figure out how many API calls Flink
is making under the hood to the Kinesis Data Stream per second to get data?

Thank you in advance.

Regards
Aqib Mehmood
Data Engineer
Retailo

Re: Flink Kinesis Connector GetRecords Frequency

Posted by "Teoh, Hong" <li...@amazon.co.uk.INVALID>.
Hi Aqib,



To add to Ahmed's response:

  *   The specific GetRecords configurations you can control for the FlinkKinesisConsumer are listed here: <https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/config/ConsumerConfigConstants.java#L379-L389>
  *   The default interval is set so that the consumer should not exceed 5 TPS per shard.
  *   Do you have any other consumers on the same stream? They might be contributing to the limit rather than Flink. If so, you can consider using the Enhanced Fan-Out (EFO) consumer.
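
As a rough illustration of the last two points, the sketch below works out how many GetRecords calls per second one shard would see for a given polling interval and number of polling consumers. It assumes each consumer issues exactly one call per interval per shard and ignores retries and request latency. The 200 ms interval used in the example is an assumed default that should be checked against the constants file linked above for your Flink version, and the class and method names are made up for illustration.

```java
public class GetRecordsRate {
    // Limit discussed in this thread: 5 GetRecords calls per second per shard.
    static final double LIMIT_TPS_PER_SHARD = 5.0;

    /** Calls per second that one polling consumer makes against one shard. */
    static double callsPerSecond(long intervalMillis) {
        return 1000.0 / intervalMillis;
    }

    /** True if all polling consumers together stay within the per-shard limit. */
    static boolean withinLimit(long intervalMillis, int pollingConsumers) {
        return callsPerSecond(intervalMillis) * pollingConsumers <= LIMIT_TPS_PER_SHARD;
    }

    public static void main(String[] args) {
        // One consumer polling every 200 ms (assumed default interval).
        System.out.println(callsPerSecond(200));
        System.out.println(withinLimit(200, 1));
        // A second polling consumer on the same shard doubles the call rate.
        System.out.println(withinLimit(200, 2));
    }
}
```

Under these assumptions a single consumer at 200 ms sits right at the limit, so any additional polling consumer on the same stream would be enough to cause throttling.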





> Is there any metric that we can use to figure out how many API calls Flink
> is making under the hood to the Kinesis Data Stream per second to get data?

No, the Flink consumer does not expose a specific metric for this, but you can probably use CloudTrail to query who is sending GetRecords API requests to your Kinesis stream:

https://docs.aws.amazon.com/athena/latest/ug/cloudtrail-logs.html



Regards,

Hong






Re: Flink Kinesis Connector GetRecords Frequency

Posted by Ahmed Hamdy <ha...@gmail.com>.
Hi Aqib

Flink uses FlinkKinesisConsumer as the source connector to read from
Kinesis sources.
You can refer to this part of the documentation
<https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/kinesis/#polling-default-record-publisher-1>
on how to configure FlinkKinesisConsumer for a certain GetRecords limit.
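
A minimal sketch of what such a configuration might look like is below. It only builds the java.util.Properties object, so it runs without Flink on the classpath. The key strings are assumed to match the values of ConsumerConfigConstants.SHARD_GETRECORDS_INTERVAL_MILLIS, ConsumerConfigConstants.SHARD_GETRECORDS_MAX and AWSConfigConstants.AWS_REGION; please verify them against your connector version. The class name, method name, region and numbers are placeholders.

```java
import java.util.Properties;

public class KinesisPollingConfig {
    // Assumed string values of the connector's config constants; verify
    // against ConsumerConfigConstants / AWSConfigConstants in your version.
    static final String AWS_REGION = "aws.region";
    static final String SHARD_GETRECORDS_INTERVAL_MILLIS = "flink.shard.getrecords.intervalmillis";
    static final String SHARD_GETRECORDS_MAX = "flink.shard.getrecords.maxrecordcount";

    /** Builds consumer properties with an explicit polling interval and batch size. */
    static Properties pollingConfig(String region, long intervalMillis, int maxRecords) {
        Properties config = new Properties();
        config.setProperty(AWS_REGION, region);
        // Time between GetRecords calls per shard; a larger value means fewer calls.
        config.setProperty(SHARD_GETRECORDS_INTERVAL_MILLIS, Long.toString(intervalMillis));
        // Maximum number of records requested per GetRecords call.
        config.setProperty(SHARD_GETRECORDS_MAX, Integer.toString(maxRecords));
        return config;
    }

    public static void main(String[] args) {
        // Placeholder values: poll every 500 ms, up to 5000 records per call.
        pollingConfig("us-east-1", 500, 5000).list(System.out);
    }
}
```

The resulting Properties object would then be passed as the third argument when constructing the consumer, for example new FlinkKinesisConsumer<>("your-stream", new SimpleStringSchema(), config).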

Regards
Ahmed Hamdy



