Posted to commits@arrow.apache.org by jo...@apache.org on 2022/05/09 13:15:02 UTC
[arrow] branch master updated: MINOR: [Python][Docs] Improving sentence on docs:python/parquet
This is an automated email from the ASF dual-hosted git repository.
jorisvandenbossche pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/master by this push:
new 35119f29b0 MINOR: [Python][Docs] Improving sentence on docs:python/parquet
35119f29b0 is described below
commit 35119f29b0e0de68b1ccc5f2066e0cc7d27fddd0
Author: alexdesiqueira <al...@igdore.org>
AuthorDate: Mon May 9 15:14:51 2022 +0200
MINOR: [Python][Docs] Improving sentence on docs:python/parquet
Just a small improvement on `docs/source/python/parquet.rst`:
> _We need not use_
becomes
> _We do not need to use_
Closes #13093 from alexdesiqueira/small_wording-python_parquet
Authored-by: alexdesiqueira <al...@igdore.org>
Signed-off-by: Joris Van den Bossche <jo...@gmail.com>
---
docs/source/python/parquet.rst | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/docs/source/python/parquet.rst b/docs/source/python/parquet.rst
index baffc55544..7048670c82 100644
--- a/docs/source/python/parquet.rst
+++ b/docs/source/python/parquet.rst
@@ -103,7 +103,7 @@ source, we use ``read_pandas`` to maintain any additional index column data:
pq.read_pandas('example.parquet', columns=['two']).to_pandas()
-We need not use a string to specify the origin of the file. It can be any of:
+We do not need to use a string to specify the origin of the file. It can be any of:
* A file path as a string
* A :ref:`NativeFile <io.native_file>` from PyArrow
@@ -118,7 +118,7 @@ maps) will perform the best.
Reading Parquet and Memory Mapping
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Because Parquet data needs to be decoded from the Parquet format
+Because Parquet data needs to be decoded from the Parquet format
and compression, it can't be directly mapped from disk.
Thus the ``memory_map`` option might perform better on some systems
but won't help much with resident memory consumption.
@@ -131,9 +131,9 @@ but won't help much with resident memory consumption.
>>> pq_array = pa.parquet.read_table("area1.parquet", memory_map=False)
>>> print("RSS: {}MB".format(pa.total_allocated_bytes() >> 20))
- RSS: 4299MB
+ RSS: 4299MB
-If you need to deal with Parquet data bigger than memory,
+If you need to deal with Parquet data bigger than memory,
the :ref:`dataset` and partitioning is probably what you are looking for.
Parquet file writing options
@@ -756,7 +756,7 @@ An example encryption configuration:
Decryption configuration
~~~~~~~~~~~~~~~~~~~~~~~~
-
+
:class:`pyarrow.parquet.encryption.DecryptionConfiguration` (used when creating
file decryption properties) is optional and it includes the following options: