Posted to commits@arrow.apache.org by fs...@apache.org on 2020/06/25 16:28:24 UTC
[arrow] branch master updated: ARROW-1682: [Doc] Expand S3/MinIO
fileystem dataset documentation
This is an automated email from the ASF dual-hosted git repository.
fsaintjacques pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/master by this push:
new 8bd34e8 ARROW-1682: [Doc] Expand S3/MinIO fileystem dataset documentation
8bd34e8 is described below
commit 8bd34e869181b0dc4f03d15e989d9e511042790f
Author: François Saint-Jacques <fs...@gmail.com>
AuthorDate: Thu Jun 25 12:27:58 2020 -0400
ARROW-1682: [Doc] Expand S3/MinIO fileystem dataset documentation
Closes #7517 from fsaintjacques/ARROW-1682
Authored-by: François Saint-Jacques <fs...@gmail.com>
Signed-off-by: François Saint-Jacques <fs...@gmail.com>
---
docs/source/python/dataset.rst | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/docs/source/python/dataset.rst b/docs/source/python/dataset.rst
index ae14e39..3d99834 100644
--- a/docs/source/python/dataset.rst
+++ b/docs/source/python/dataset.rst
@@ -325,6 +325,24 @@ The currently available classes are :class:`~pyarrow.fs.S3FileSystem` and
details.
+Reading from MinIO
+------------------
+
+In addition to cloud storage, pyarrow also supports reading from a
+`MinIO <https://github.com/minio/minio>`_ object storage instance emulating S3
+APIs. Paired with `toxiproxy <https://github.com/shopify/toxiproxy>`_, this is
+useful for testing or benchmarking.
+
+.. code-block:: python
+
+    from pyarrow import dataset as ds, fs
+
+    # By default, MinIO will listen for unencrypted HTTP traffic.
+    minio = fs.S3FileSystem(scheme="http", endpoint_override="localhost:9000")
+    dataset = ds.dataset("ursa-labs-taxi-data/", filesystem=minio,
+                         partitioning=["year", "month"])
+
+
Manual specification of the Dataset
-----------------------------------