Posted to user@flink.apache.org by Vignesh Kumar Kathiresan via user <us...@flink.apache.org> on 2022/10/17 19:47:39 UTC

Presto S3 filesystem access issue - checkpointing - EKS

Hello all,

I am trying to set up checkpointing to S3 for a Flink application using the
recommended Presto S3 filesystem plugin.
My application is deployed on a Kubernetes cluster (EKS) in Flink
application mode.

When I start the application, I get a 403 Forbidden response:

```
Caused by: com.facebook.presto.hive.s3.PrestoS3FileSystem$UnrecoverableS3OperationException: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: ; Proxy: null) Path: s3p://bucket/checkpoint_dir/xxx/chk-2/yyy)
    at com.facebook.presto.hive.s3.PrestoS3FileSystem.lambda$getS3ObjectMetadata$5(PrestoS3FileSystem.java:677) ~[?:?]
    at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:139) ~[?:?]
```

So far I have:
1) Given the IAM role attached to the service account full S3 access.
2) Set the checkpointing config to
state.checkpoints.dir: s3p://BUCKET_NAME/checkpoints (also tried with s3://).
3) Dug into the Presto configs and turned this one off too:
presto.s3.use-instance-credentials: "false". (Is this right?)

Is there something I am missing (some other config to be set?) for this
checkpointing access?

P.S. We have other application-level access to S3 working fine.

Thanks,
Vignesh

Re: [E] Re: Presto S3 filesystem access issue - checkpointing - EKS

Posted by Vignesh Kumar Kathiresan via user <us...@flink.apache.org>.
Hi Yanfei,

Thanks for the reply. I wanted to specifically try out using the IRSA role
of the EKS pods. Apparently I needed permissions for S3 managed KMS keys.
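
As a rough illustration (not something taken from the setup described in this thread), one way to confirm that a pod is actually set up for IRSA is to check for the two environment variables that the EKS pod identity webhook injects, AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE. The helper names below are hypothetical, and the sketch only inspects the environment; it does not call AWS.

```python
import os

# Environment variables that the EKS pod identity webhook injects for IRSA.
IRSA_ENV_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def missing_irsa_settings(env=None):
    """Return the IRSA-related variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in IRSA_ENV_VARS if not env.get(name)]


def token_file_readable(env=None):
    """Check that the projected service account token file exists and is non-empty."""
    env = os.environ if env is None else env
    path = env.get("AWS_WEB_IDENTITY_TOKEN_FILE", "")
    return bool(path) and os.path.isfile(path) and os.path.getsize(path) > 0


if __name__ == "__main__":
    missing = missing_irsa_settings()
    if missing:
        print("IRSA not configured, missing:", ", ".join(missing))
    elif not token_file_readable():
        print("Token file is missing or empty")
    else:
        print("IRSA variables present for role", os.environ["AWS_ROLE_ARN"])
```

Running this inside the JobManager or TaskManager container shows whether the role is wired into the pod at all, which helps separate a missing role from missing permissions on that role.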

So, for future reference, to make checkpointing to S3 work with the Presto S3
filesystem plugin:

1. Configs to set

   - presto.s3.use-instance-credentials: "false"
   - state.savepoints.dir: s3p://<FLINK_DATA_BUCKET_NAME>/savepoints
   - state.checkpoints.dir: s3p://<FLINK_DATA_BUCKET_NAME>/checkpoints

2. Appropriate permissions for

   - s3
   - kms -> kms:GenerateDataKey, kms:Decrypt

3. Add the AWS STS dependency

4. Set this env variable:
   ENABLE_BUILT_IN_PLUGINS=flink-s3-fs-presto-1.15.2.jar
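
The permissions in step 2 could be sketched as an IAM policy document along these lines. This is only an illustration under assumptions: the thread names the two KMS actions but says just "s3" for the rest, so the specific S3 actions below are a guess at a reasonable minimum, and the function name, bucket name, and key ARN are placeholders.

```python
import json


def checkpoint_policy(bucket, kms_key_arn):
    """Build an IAM policy document for checkpointing to a KMS-encrypted bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object-level actions (assumed set) are granted on the keys inside it.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {
                # The two KMS actions named in the thread.
                "Effect": "Allow",
                "Action": ["kms:GenerateDataKey", "kms:Decrypt"],
                "Resource": kms_key_arn,
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder bucket name and key ARN, for illustration only.
    policy = checkpoint_policy(
        "my-flink-data-bucket",
        "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON can be attached to the role bound to the service account. Without the KMS statement, writes to an encrypted bucket can fail with a 403 even when S3 access itself is fully granted, which matches the symptom reported above.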




Re: Presto S3 filesystem access issue - checkpointing - EKS

Posted by yanfei lei <fr...@gmail.com>.
Hi Vignesh,
The 403 status code makes this look like an authorization issue.

> 3) Some digging into the presto configs and I had this one turned off too
> presto.s3.use-instance-credentials: "false". (Is this right?)

According to the documentation [1], it is recommended to set
hive.s3.use-instance-credentials to true and use IAM roles for EC2 to
govern access to S3.
You could try one of the following two approaches:
1) Set s3.use-instance-credentials to true and use IAM roles.
2) Or set hive.s3.aws-access-key and hive.s3.aws-secret-key directly.


[1] https://prestodb.io/docs/current/connector/hive.html#s3-credentials

Best,
Yanfei



