Posted to issues@arrow.apache.org by "Mikhail Tolmachev (Jira)" <ji...@apache.org> on 2021/11/09 12:59:00 UTC

[jira] [Created] (ARROW-14640) reading data from S3

Mikhail Tolmachev created ARROW-14640:
-----------------------------------------

             Summary: reading data from S3
                 Key: ARROW-14640
                 URL: https://issues.apache.org/jira/browse/ARROW-14640
             Project: Apache Arrow
          Issue Type: Bug
          Components: R
    Affects Versions: 6.0.0
            Reporter: Mikhail Tolmachev


I am trying to read data directly from S3. I work behind a proxy, so I have set the environment variables HTTP_PROXY and HTTPS_PROXY. I have also set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. I constructed the following URIs and bucket object for reading:
{code:java}
s3_uri <- paste0("s3://", accessKeyId, ":", URLencode(secretAccessKey, reserved = TRUE), "@", bucketName, "/", path)
s3_uri_short <- paste0("s3://", bucketName, "/", path)
bucketS3 <- arrow::s3_bucket(bucket = bucketName, access_key = accessKeyId,
                             secret_key = URLencode(secretAccessKey, reserved = TRUE)){code}
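To make the encoding step above concrete, here is a minimal base-R sketch of what URLencode(..., reserved = TRUE) does to a secret that contains URI-reserved characters. All values (access_key_id, secret_access_key, bucket_name, object_path) are hypothetical placeholders, not real credentials. Percent-encoding is only meaningful inside a URI; as far as I understand, the secret_key argument of s3_bucket() is taken as a plain string, so it may be worth trying the raw key there, though whether that relates to the error below is not established.

```r
# Hypothetical credentials for illustration only; these are not real keys.
access_key_id <- "AKIAEXAMPLE"
secret_access_key <- "abc/def+ghi="
bucket_name <- "example-bucket"
object_path <- "data/file.parquet"

# reserved = TRUE also escapes URI-reserved characters such as "/", "+" and "=",
# which would otherwise be misread as URI structure.
encoded_secret <- utils::URLencode(secret_access_key, reserved = TRUE)

# Same construction as s3_uri above, with the placeholder values.
s3_uri <- paste0("s3://", access_key_id, ":", encoded_secret,
                 "@", bucket_name, "/", object_path)

# Decoding the encoded value should give back the original secret.
round_trip_ok <- identical(utils::URLdecode(encoded_secret), secret_access_key)

print(encoded_secret)
print(round_trip_ok)
```

Printing the encoded secret (never a real one) makes it easy to see which characters were escaped before the URI is handed to arrow::read_parquet.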
But none of them worked with arrow::read_parquet:
{code:java}
df <- arrow::read_parquet(s3_uri)
df <- arrow::read_parquet(s3_uri_short) 
df <- arrow::read_parquet(bucketS3$path(x = path)){code}
The error is:
{code:java}
 IOError: When reading information for key AWS Error [code 15]: No response body. with address : {code}
How can I fix the problem, or at least get more informative output? Thanks for your attention.

I am aware of the same issue, https://issues.apache.org/jira/browse/ARROW-12126, but I can't reopen it.
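
As a first diagnostic step for the setup described above, it can help to confirm that the R session actually sees the proxy and AWS variables. The sketch below uses only base R and reports whether each variable is set, without printing its value. It does not show whether Arrow or the AWS SDK honours these variables; that is a separate question this check cannot answer.

```r
# Variables named in the report above.
vars <- c("HTTP_PROXY", "HTTPS_PROXY",
          "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")

# Look up each variable; unset ones come back as an empty string.
values <- Sys.getenv(vars, unset = "")

# TRUE if the variable is set to a non-empty value. Values are not printed,
# so secrets do not end up in logs.
is_set <- stats::setNames(nzchar(values), vars)
print(is_set)

# List anything missing in the current session.
missing_vars <- vars[!is_set]
if (length(missing_vars) > 0) {
  message("Not set in this R session: ", paste(missing_vars, collapse = ", "))
}
```

Running this in the same session that calls arrow::read_parquet rules out the case where the variables were set in a different shell or process.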
