You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@beam.apache.org by Sankar Subramaniam <sa...@bbc.co.uk> on 2022/11/11 11:34:27 UTC

Reading from AWS Kinesis Stream Cross account

Hello there,

Good morning.

We are using Apache Beam (Java SDK 2.35.0) in our data pipeline to read from AWS Kinesis Stream using AWS KDA (Kinesis Data Analytics) and so far it’s working fine for few data pipelines.

Now we have got a new requirement that AWS KDA (running an application implemented using Apache Bean SDK 2.35.0) needs to read the data from AWS Kinesis Stream in different account.

I have followed AWS Documentation<https://docs.aws.amazon.com/kinesisanalytics/latest/java/examples-cross.html> to grant required permission and policies for the AWS KDA and in Apache Bean implementation we have something like below,

KinesisIO.read()
    .withStreamName(getInputPattern())
    .withAWSClientsProvider(
        new KinesisClientsProvider(
            Regions.fromName(getAwsRegion()),
            getAwsCredentialsProvider(),
            getAwsVerifyCertificate(),
            getAwsKinesisServiceEndpoint(),
            getAwsCloudwatchServiceEndpoint()))

Here, we could only set the Kinesis Stream and not the ARN. With the above implementation, this application couldn’t read from the stream and from the logs we are seeing it’s trying to connect to the stream in the same AWS Account. The ARN formed using streamName assumes it’s in the same AWS Account whereas we want to connect to Kinesis Stream in another AWS Account.

Note: We are using ‘DefaultAWSCredentialsProviderChain’.

With this situation, wondering am I missing something / doing incorrectly here. Could you please give us some pointers how to use Beam to read from a (Kinesis)Stream in different AWS Account. Thanks.

Regards,
Sankar


Re: Reading from AWS Kinesis Stream Cross account

Posted by Sankar Subramaniam <sa...@bbc.co.uk>.
Thanks Alexey and Moritz for your valuable inputs.

We were using ‘STSAssumeRoleSessionCredentialsProvider’ because of AWS SDKv1. Now as per response and the issue, we will upgrade AWS SDK and Beam SDK and give it a go. Will keep you posted on this in few days time. Thanks.

Regards,
Sankar

From: Moritz Mack <mm...@talend.com>
Reply to: "user@beam.apache.org" <us...@beam.apache.org>
Date: Tuesday, 15 November 2022 at 10:45
To: Alexey Romanenko <ar...@gmail.com>, "user@beam.apache.org" <us...@beam.apache.org>
Subject: Re: Reading from AWS Kinesis Stream Cross account

Hi Sankar,

First, as Alexey pointed out, please try and migrate to the Beam AWS SDK v2 as soon as possible. The SDK v1 (including the Kinesis module) has been long deprecated and will be removed some time soon.

The AWS API doesn’t support cross-account access for Kinesis using an ARN. This is always based on the stream name as shown in the example you’ve linked [1].  You must use STS / assume role credentials to do this, it can’t be done using DefaultAWSCredentialsProviderChain.

Assuming you’ve correctly configured all required policies and roles (on both accounts) following [1], you can then use the StsAssumeRoleCredentialsProvider. For the Beam AWS SDK v2 this can be done using pipeline options [2], or programmatically of course.
Support for StsAssumeRoleCredentialsProvider is more limited in SDK v1, though it should also work.

--awsCredentialsProvider={
   "@type": "StsAssumeRoleCredentialsProvider",
   "roleArn": "arn:aws:iam::SOURCE01234567:role/KA-Source-Stream-Role",
   "roleSessionName": "ksassumedrolesession"
}

The STS client will implicitly use the DefaultAWSCredentialsProviderChain to assume the above role of the source account using an authenticated principal of the sink account (source/sink nomenclature as used in [1]). Please make sure your environment is configured accordingly, this part is easy to miss.

Let me know if you have more questions!
Regards,
Moritz


[1] https://docs.aws.amazon.com/kinesisanalytics/latest/java/examples-cross.html
[2] https://beam.apache.org/releases/javadoc/2.42.0/org/apache/beam/sdk/io/aws2/options/AwsOptions.html#getAwsCredentialsProvider--

On 14.11.22, 19:05, "Alexey Romanenko" <ar...@gmail.com> wrote:

If I’m not mistaken, it’s not supported in the current implementation of KinesisIO. PS: Btw, if you still use KinesisIO connector based on AWS SDK v1 [1] then it’s highly recommended to switch to one based on AWS SDK v2 [2] since former is

If I’m not mistaken, it’s not supported in the current implementation of KinesisIO.

PS: Btw, if you still use KinesisIO connector based on AWS SDK v1 [1] then it’s highly recommended to switch to one based on AWS SDK v2 [2] since former is deprecated.

[1] https://github.com/apache/beam/blob/master/sdks/java/io/kinesis/src/main/java/org/apache/beam/sdk/io/kinesis/KinesisIO.java<https://urldefense.com/v3/__https:/github.com/apache/beam/blob/master/sdks/java/io/kinesis/src/main/java/org/apache/beam/sdk/io/kinesis/KinesisIO.java__;!!CiXD_PY!XzTvAWqfcGyw-k0V1j7SeMMVBnlyM1dysSffSXz1PcoXs_RSX3GuJf_sXTE4Sk9QIou906p0z_tyKTCweGmqKQ$>
[2] https://github.com/apache/beam/blob/master/sdks/java/io/amazon-web-services2/src/main/java/org/apache/beam/sdk/io/aws2/kinesis/KinesisIO.java<https://urldefense.com/v3/__https:/github.com/apache/beam/blob/master/sdks/java/io/amazon-web-services2/src/main/java/org/apache/beam/sdk/io/aws2/kinesis/KinesisIO.java__;!!CiXD_PY!XzTvAWqfcGyw-k0V1j7SeMMVBnlyM1dysSffSXz1PcoXs_RSX3GuJf_sXTE4Sk9QIou906p0z_tyKTBLSThXSg$>

—
Alexey

On 11 Nov 2022, at 12:34, Sankar Subramaniam <sa...@bbc.co.uk> wrote:

Hello there,

Good morning.

We are using Apache Beam (Java SDK 2.35.0) in our data pipeline to read from AWS Kinesis Stream using AWS KDA (Kinesis Data Analytics) and so far it’s working fine for few data pipelines.

Now we have got a new requirement that AWS KDA (running an application implemented using Apache Bean SDK 2.35.0) needs to read the data from AWS Kinesis Stream in different account.

I have followed AWS Documentation<https://urldefense.com/v3/__https:/docs.aws.amazon.com/kinesisanalytics/latest/java/examples-cross.html__;!!CiXD_PY!XzTvAWqfcGyw-k0V1j7SeMMVBnlyM1dysSffSXz1PcoXs_RSX3GuJf_sXTE4Sk9QIou906p0z_tyKTBEETAzMw$> to grant required permission and policies for the AWS KDA and in Apache Bean implementation we have something like below,

KinesisIO.read()
    .withStreamName(getInputPattern())
    .withAWSClientsProvider(
        new KinesisClientsProvider(
            Regions.fromName(getAwsRegion()),
            getAwsCredentialsProvider(),
            getAwsVerifyCertificate(),
            getAwsKinesisServiceEndpoint(),
            getAwsCloudwatchServiceEndpoint()))

Here, we could only set the Kinesis Stream and not the ARN. With the above implementation, this application couldn’t read from the stream and from the logs we are seeing it’s trying to connect to the stream in the same AWS Account. The ARN formed using streamName assumes it’s in the same AWS Account whereas we want to connect to Kinesis Stream in another AWS Account.

Note: We are using ‘DefaultAWSCredentialsProviderChain’.

With this situation, wondering am I missing something / doing incorrectly here. Could you please give us some pointers how to use Beam to read from a (Kinesis)Stream in different AWS Account. Thanks.

Regards,
Sankar


As a recipient of an email from Talend, your contact personal data will be on our systems. Please see our privacy notice. <https://www.talend.com/privacy/>




----------------------------

http://www.bbc.co.uk
This e-mail (and any attachments) is confidential and may contain personal views which are not the views of the BBC unless specifically stated.
If you have received it in error, please delete it from your system.
Do not use, copy or disclose the information in any way nor act in reliance on it and notify the sender immediately.
Please note that the BBC monitors e-mails sent or received.
Further communication will signify your consent to this.

---------------------

Re: Reading from AWS Kinesis Stream Cross account

Posted by Moritz Mack <mm...@talend.com>.
Hi Sankar,

First, as Alexey pointed out, please try and migrate to the Beam AWS SDK v2 as soon as possible. The SDK v1 (including the Kinesis module) has been long deprecated and will be removed some time soon.

The AWS API doesn’t support cross-account access for Kinesis using an ARN. This is always based on the stream name as shown in the example you’ve linked [1].  You must use STS / assume role credentials to do this, it can’t be done using DefaultAWSCredentialsProviderChain.

Assuming you’ve correctly configured all required policies and roles (on both accounts) following [1], you can then use the StsAssumeRoleCredentialsProvider. For the Beam AWS SDK v2 this can be done using pipeline options [2], or programmatically of course.
Support for StsAssumeRoleCredentialsProvider is more limited in SDK v1, though it should also work.

--awsCredentialsProvider={
   "@type": "StsAssumeRoleCredentialsProvider",
   "roleArn": "arn:aws:iam::SOURCE01234567:role/KA-Source-Stream-Role",
   "roleSessionName": "ksassumedrolesession"
}

The STS client will implicitly use the DefaultAWSCredentialsProviderChain to assume the above role of the source account using an authenticated principal of the sink account (source/sink nomenclature as used in [1]). Please make sure your environment is configured accordingly, this part is easy to miss.

Let me know if you have more questions!
Regards,
Moritz


[1] https://docs.aws.amazon.com/kinesisanalytics/latest/java/examples-cross.html
[2] https://beam.apache.org/releases/javadoc/2.42.0/org/apache/beam/sdk/io/aws2/options/AwsOptions.html#getAwsCredentialsProvider--

On 14.11.22, 19:05, "Alexey Romanenko" <ar...@gmail.com> wrote:

If I’m not mistaken, it’s not supported in the current implementation of KinesisIO. PS: Btw, if you still use KinesisIO connector based on AWS SDK v1 [1] then it’s highly recommended to switch to one based on AWS SDK v2 [2] since former is

If I’m not mistaken, it’s not supported in the current implementation of KinesisIO.

PS: Btw, if you still use KinesisIO connector based on AWS SDK v1 [1] then it’s highly recommended to switch to one based on AWS SDK v2 [2] since former is deprecated.

[1] https://github.com/apache/beam/blob/master/sdks/java/io/kinesis/src/main/java/org/apache/beam/sdk/io/kinesis/KinesisIO.java<https://urldefense.com/v3/__https:/github.com/apache/beam/blob/master/sdks/java/io/kinesis/src/main/java/org/apache/beam/sdk/io/kinesis/KinesisIO.java__;!!CiXD_PY!XzTvAWqfcGyw-k0V1j7SeMMVBnlyM1dysSffSXz1PcoXs_RSX3GuJf_sXTE4Sk9QIou906p0z_tyKTCweGmqKQ$>
[2] https://github.com/apache/beam/blob/master/sdks/java/io/amazon-web-services2/src/main/java/org/apache/beam/sdk/io/aws2/kinesis/KinesisIO.java<https://urldefense.com/v3/__https:/github.com/apache/beam/blob/master/sdks/java/io/amazon-web-services2/src/main/java/org/apache/beam/sdk/io/aws2/kinesis/KinesisIO.java__;!!CiXD_PY!XzTvAWqfcGyw-k0V1j7SeMMVBnlyM1dysSffSXz1PcoXs_RSX3GuJf_sXTE4Sk9QIou906p0z_tyKTBLSThXSg$>

—
Alexey


On 11 Nov 2022, at 12:34, Sankar Subramaniam <sa...@bbc.co.uk> wrote:

Hello there,

Good morning.

We are using Apache Beam (Java SDK 2.35.0) in our data pipeline to read from AWS Kinesis Stream using AWS KDA (Kinesis Data Analytics) and so far it’s working fine for few data pipelines.

Now we have got a new requirement that AWS KDA (running an application implemented using Apache Bean SDK 2.35.0) needs to read the data from AWS Kinesis Stream in different account.

I have followed AWS Documentation<https://urldefense.com/v3/__https:/docs.aws.amazon.com/kinesisanalytics/latest/java/examples-cross.html__;!!CiXD_PY!XzTvAWqfcGyw-k0V1j7SeMMVBnlyM1dysSffSXz1PcoXs_RSX3GuJf_sXTE4Sk9QIou906p0z_tyKTBEETAzMw$> to grant required permission and policies for the AWS KDA and in Apache Bean implementation we have something like below,

KinesisIO.read()
    .withStreamName(getInputPattern())
    .withAWSClientsProvider(
        new KinesisClientsProvider(
            Regions.fromName(getAwsRegion()),
            getAwsCredentialsProvider(),
            getAwsVerifyCertificate(),
            getAwsKinesisServiceEndpoint(),
            getAwsCloudwatchServiceEndpoint()))

Here, we could only set the Kinesis Stream and not the ARN. With the above implementation, this application couldn’t read from the stream and from the logs we are seeing it’s trying to connect to the stream in the same AWS Account. The ARN formed using streamName assumes it’s in the same AWS Account whereas we want to connect to Kinesis Stream in another AWS Account.

Note: We are using ‘DefaultAWSCredentialsProviderChain’.

With this situation, wondering am I missing something / doing incorrectly here. Could you please give us some pointers how to use Beam to read from a (Kinesis)Stream in different AWS Account. Thanks.

Regards,
Sankar


As a recipient of an email from Talend, your contact personal data will be on our systems. Please see our privacy notice. <https://www.talend.com/privacy/>



Re: Reading from AWS Kinesis Stream Cross account

Posted by Alexey Romanenko <ar...@gmail.com>.
If I’m not mistaken, it’s not supported in the current implementation of KinesisIO.

PS: Btw, if you still use KinesisIO connector based on AWS SDK v1 [1] then it’s highly recommended to switch to one based on AWS SDK v2 [2] since former is deprecated.

[1] https://github.com/apache/beam/blob/master/sdks/java/io/kinesis/src/main/java/org/apache/beam/sdk/io/kinesis/KinesisIO.java
[2] https://github.com/apache/beam/blob/master/sdks/java/io/amazon-web-services2/src/main/java/org/apache/beam/sdk/io/aws2/kinesis/KinesisIO.java

—
Alexey

> On 11 Nov 2022, at 12:34, Sankar Subramaniam <sa...@bbc.co.uk> wrote:
> 
> Hello there,
>  
> Good morning.
>  
> We are using Apache Beam (Java SDK 2.35.0) in our data pipeline to read from AWS Kinesis Stream using AWS KDA (Kinesis Data Analytics) and so far it’s working fine for few data pipelines.
>  
> Now we have got a new requirement that AWS KDA (running an application implemented using Apache Bean SDK 2.35.0) needs to read the data from AWS Kinesis Stream in different account.
>  
> I have followed AWS Documentation <https://docs.aws.amazon.com/kinesisanalytics/latest/java/examples-cross.html> to grant required permission and policies for the AWS KDA and in Apache Bean implementation we have something like below,
>  
> KinesisIO.read()
>     .withStreamName(getInputPattern())
>     .withAWSClientsProvider(
>         new KinesisClientsProvider(
>             Regions.fromName(getAwsRegion()),
>             getAwsCredentialsProvider(),
>             getAwsVerifyCertificate(),
>             getAwsKinesisServiceEndpoint(),
>             getAwsCloudwatchServiceEndpoint()))
>  
> Here, we could only set the Kinesis Stream and not the ARN. With the above implementation, this application couldn’t read from the stream and from the logs we are seeing it’s trying to connect to the stream in the same AWS Account. The ARN formed using streamName assumes it’s in the same AWS Account whereas we want to connect to Kinesis Stream in another AWS Account.
>  
> Note: We are using ‘DefaultAWSCredentialsProviderChain’.
>  
> With this situation, wondering am I missing something / doing incorrectly here. Could you please give us some pointers how to use Beam to read from a (Kinesis)Stream in different AWS Account. Thanks.
>  
> Regards,
> Sankar