Posted to jira@arrow.apache.org by "Antoine Pitrou (Jira)" <ji...@apache.org> on 2020/09/08 10:05:00 UTC

[jira] [Commented] (ARROW-9775) [C++] Automatic S3 region selection

    [ https://issues.apache.org/jira/browse/ARROW-9775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192096#comment-17192096 ] 

Antoine Pitrou commented on ARROW-9775:
---------------------------------------

It seems the bucket's region can be determined through a HEAD request on the bucket:
https://github.com/aws/aws-cli/issues/2431

This is how boto does it:
https://github.com/boto/botocore/pull/936/files

An S3Client is bound to a region, so some care will be needed in the implementation.
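
As a rough illustration of the HEAD-request idea (this is not Arrow or botocore code), the sketch below sends a HEAD request to a bucket endpoint and reads the `x-amz-bucket-region` response header. The function names, the virtual-hosted endpoint `<bucket>.s3.amazonaws.com`, and the timeout are assumptions made for the example; bucket names containing dots would likely need different handling because of TLS hostname matching.

```python
import http.client
from typing import Mapping, Optional

# Response header in which S3 reports the bucket's region.
REGION_HEADER = "x-amz-bucket-region"


def region_from_headers(
    headers: Mapping[str, str], default: Optional[str] = None
) -> Optional[str]:
    """Return the region from response headers (case-insensitive), else `default`."""
    for name, value in headers.items():
        if name.lower() == REGION_HEADER:
            return value
    return default


def detect_bucket_region(bucket: str, timeout: float = 10.0) -> Optional[str]:
    """Hypothetical helper: HEAD the bucket and read its region header.

    The endpoint below is an assumption for this sketch. http.client does not
    follow redirects or raise on 3xx/4xx, so the headers can still be
    inspected when the request is redirected or denied.
    """
    conn = http.client.HTTPSConnection(f"{bucket}.s3.amazonaws.com", timeout=timeout)
    try:
        conn.request("HEAD", "/")
        response = conn.getresponse()
        return region_from_headers(dict(response.getheaders()))
    finally:
        conn.close()
```

The detected region could then be used to construct a client bound to that region, falling back to a default such as 'us-east-1' when the header is absent.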

> [C++] Automatic S3 region selection
> -----------------------------------
>
>                 Key: ARROW-9775
>                 URL: https://issues.apache.org/jira/browse/ARROW-9775
>             Project: Apache Arrow
>          Issue Type: Wish
>          Components: C++, Python
>         Environment: macOS, Linux.
>            Reporter: Sahil Gupta
>            Priority: Major
>              Labels: filesystem
>             Fix For: 2.0.0
>
>
> Currently, PyArrow and ArrowCpp must be given the region of the S3 file/bucket; otherwise they default to 'us-east-1'. Ideally, PyArrow and ArrowCpp would detect the region automatically and fetch the files. For instance, s3fs and boto3 can read and write files without the region being specified explicitly. Similar functionality to auto-detect the region would be great to have in PyArrow and ArrowCpp.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)