Posted to common-dev@hadoop.apache.org by "Aarti (Jira)" <ji...@apache.org> on 2022/07/13 11:01:00 UTC

[jira] [Created] (HADOOP-18338) Unable to access data from S3 bucket over a vpc endpoint - 400 bad request

Aarti created HADOOP-18338:
------------------------------

             Summary: Unable to access data from S3 bucket over a vpc endpoint - 400 bad request
                 Key: HADOOP-18338
                 URL: https://issues.apache.org/jira/browse/HADOOP-18338
             Project: Hadoop Common
          Issue Type: Bug
          Components: common, fs/s3
            Reporter: Aarti
         Attachments: spark_s3.txt, spark_s3_vpce_error.txt

We are trying to write to an S3 bucket whose policy specifies particular IAM users, SSE, and endpoints. The policy names two endpoints: a gateway endpoint and an interface endpoint.

When we use the gateway endpoint, which is the general one (https://s3.us-east-1.amazonaws.com), the Spark code executes successfully and writes to the S3 bucket.

But when we use the interface endpoint, which is the one we ideally have to use (https://bucket.vpce-<>.s3.us-east-1.vpce.amazonaws.com), the Spark code throws this error:

py4j.protocol.Py4JJavaError: An error occurred while calling o91.save.

: org.apache.hadoop.fs.s3a.AWSBadRequestException: doesBucketExist on <BUCKET NAME>: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: BA67GFNR0Q127VFM; S3 Extended Request ID: BopO6Cn1hNzXdWh89hZlnl/QyTJef/1cxmptuP6f4yH7tqfMO36s/7mF+q8v6L5+FmYHXbFdEss=; Proxy: null), S3 Extended Request ID: BopO6Cn1hNzXdWh89hZlnl/QyTJef/1cxmptuP6f4yH7tqfMO36s/7mF+q8v6L5+FmYHXbFdEss=:400 Bad Request: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: BA67GFNR0Q127VFM; S3 Extended Request ID: BopO6Cn1hNzXdWh89hZlnl/QyTJef/1cxmptuP6f4yH7tqfMO36s/7mF+q8v6L5+FmYHXbFdEss=; Proxy: null)

Attaching the PySpark code and the exception trace:

spark_s3.txt
spark_s3_vpce_error.txt
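
For readers without the attachments, the two setups described above differ mainly in the S3A endpoint option. The sketch below is not the reporter's code; it only builds the Spark options for each case. `fs.s3a.endpoint` and `fs.s3a.endpoint.region` are S3A configuration properties (the latter is available only in newer Hadoop releases), while the helper name `s3a_endpoint_conf` is hypothetical. Setting the region explicitly is an assumption about a possible workaround, on the theory that the signing region may not be derived correctly from a VPC endpoint hostname; it is not a confirmed fix for this issue.

```python
def s3a_endpoint_conf(endpoint, region=None):
    """Build Spark options that point the S3A connector at an endpoint.

    Hypothetical helper for illustration only. Keys use the
    "spark.hadoop." prefix so Spark forwards them to the Hadoop config.
    """
    conf = {"spark.hadoop.fs.s3a.endpoint": endpoint}
    if region is not None:
        # Assumption: naming the region explicitly may help when it
        # cannot be inferred from the endpoint hostname.
        conf["spark.hadoop.fs.s3a.endpoint.region"] = region
    return conf


# Gateway endpoint: the case reported as working.
gateway = s3a_endpoint_conf("https://s3.us-east-1.amazonaws.com")

# Interface endpoint: the case reported as failing with 400 Bad Request.
# The "<>" placeholder is kept exactly as it appears in the report.
interface = s3a_endpoint_conf(
    "https://bucket.vpce-<>.s3.us-east-1.vpce.amazonaws.com",
    region="us-east-1",
)

for key, value in sorted(interface.items()):
    print(f"{key}={value}")
```

Each key/value pair could then be passed to `SparkSession.builder.config(key, value)` before the session is created.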



