Posted to commits@arrow.apache.org by we...@apache.org on 2021/09/10 23:46:24 UTC
[arrow-cookbook] branch main updated: Adding anonymous flag to s3 (#70)
This is an automated email from the ASF dual-hosted git repository.
westonpace pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-cookbook.git
The following commit(s) were added to refs/heads/main by this push:
new 9750a64 Adding anonymous flag to s3 (#70)
9750a64 is described below
commit 9750a6402436f0379a9a7bde4184076c615f5a93
Author: Tomek Drabas <dr...@gmail.com>
AuthorDate: Fri Sep 10 16:46:18 2021 -0700
Adding anonymous flag to s3 (#70)
* Adding anonymous flag to s3
* Fixing missing comma
* Info about s3 credentials
---
python/source/io.rst | 28 ++++++++++++++++++++++++++--
1 file changed, 26 insertions(+), 2 deletions(-)
diff --git a/python/source/io.rst b/python/source/io.rst
old mode 100644
new mode 100755
index 2c1fd82..db03d74
--- a/python/source/io.rst
+++ b/python/source/io.rst
@@ -394,7 +394,10 @@ partitioned data coming from remote sources like S3 or HDFS.
from pyarrow import fs
# List content of s3://ursa-labs-taxi-data/2011
- s3 = fs.SubTreeFileSystem("ursa-labs-taxi-data", fs.S3FileSystem(region="us-east-2"))
+ s3 = fs.SubTreeFileSystem(
+ "ursa-labs-taxi-data",
+ fs.S3FileSystem(region="us-east-2", anonymous=True)
+ )
for entry in s3.get_file_info(fs.FileSelector("2011", recursive=True)):
if entry.type == fs.FileType.File:
print(entry.path)
@@ -419,7 +422,7 @@ by ``month`` using
.. testcode::
- dataset = ds.dataset("s3://ursa-labs-taxi-data/2011",
+ dataset = ds.dataset("s3://ursa-labs-taxi-data/2011",
partitioning=["month"])
for f in dataset.files[:10]:
print(f)
@@ -447,6 +450,27 @@ or :meth:`pyarrow.dataset.Dataset.to_batches` like you would for a local one.
It is possible to load partitioned data also in the ipc arrow
format or in feather format.
+.. warning::
+
+ If the above code throws an error most likely the reason is your
+ AWS credentials are not set. Follow these instructions to get
+ ``AWS Access Key Id`` and ``AWS Secret Access Key``:
+ `AWS Credentials <https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html>`_.
+
+ The credentials are normally stored in ``~/.aws/credentials`` (on Mac or Linux)
+ or in ``C:\Users\<USERNAME>\.aws\credentials`` (on Windows) file.
+ You will need to either create or update this file in the appropriate location.
+
+ The contents of the file should look like this:
+
+ .. code-block:: bash
+
+ [default]
+ aws_access_key_id=<YOUR_AWS_ACCESS_KEY_ID>
+ aws_secret_access_key=<YOUR_AWS_SECRET_ACCESS_KEY>
+
+
+
Write a Feather file
====================
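
Note on the credentials file described in the warning added by this commit: the sketch below is not part of the commit. It shows one possible way to create or update a file in that ``[default]`` layout using Python's standard ``configparser`` module. The helper name ``write_aws_credentials``, its parameters, and the temporary target path are hypothetical and chosen for illustration only. The example writes to a temporary directory rather than ``~/.aws/credentials`` so that it does not touch real credentials.

```python
import configparser
import tempfile
from pathlib import Path


def write_aws_credentials(path, access_key_id, secret_access_key, profile="default"):
    """Create or update an AWS-style credentials file at `path`.

    Hypothetical helper for illustration; not part of pyarrow or the AWS tools.
    """
    path = Path(path)
    # interpolation=None keeps characters such as '%' in a key literal.
    config = configparser.ConfigParser(interpolation=None)
    # Preserve any profiles already in the file (a missing file is ignored).
    config.read(path)
    if not config.has_section(profile):
        config.add_section(profile)
    config.set(profile, "aws_access_key_id", access_key_id)
    config.set(profile, "aws_secret_access_key", secret_access_key)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w") as f:
        # Write "key=value" without spaces, matching the layout in the warning.
        config.write(f, space_around_delimiters=False)
    return path


if __name__ == "__main__":
    # Assumption: a throwaway directory stands in for the user's home directory.
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / ".aws" / "credentials"
        write_aws_credentials(
            target,
            "<YOUR_AWS_ACCESS_KEY_ID>",
            "<YOUR_AWS_SECRET_ACCESS_KEY>",
        )
        print(target.read_text())
```

To write the real file, the same function could be pointed at ``Path.home() / ".aws" / "credentials"``. For the public bucket used in the recipe, the ``anonymous=True`` flag introduced by this commit avoids the need for credentials altogether.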