Posted to commits@arrow.apache.org by jo...@apache.org on 2023/06/13 12:37:54 UTC
[arrow] branch main updated: GH-35858: [Python] disallow none schema parquet writer (#36011)
This is an automated email from the ASF dual-hosted git repository.
jorisvandenbossche pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/main by this push:
new 1ddeaab30b GH-35858: [Python] disallow none schema parquet writer (#36011)
1ddeaab30b is described below
commit 1ddeaab30b905b97c1c17a41daa1b3cd923e91d9
Author: Weston Pace <we...@gmail.com>
AuthorDate: Tue Jun 13 05:37:39 2023 -0700
GH-35858: [Python] disallow none schema parquet writer (#36011)
### Rationale for this change
Previously, passing in None for the schema would cause a segmentation fault.
### What changes are included in this PR?
The `schema` argument of `ParquetWriter.__cinit__` is now declared `not None`, so passing None raises a TypeError instead.
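As a rough illustration of what the `not None` clause does: Cython lets a typed extension-type argument be None by default, and `not None` makes the generated argument check reject it with a TypeError before the function body runs. The pure-Python sketch below mimics that guard. `Schema`, `Writer`, and `try_construct` are hypothetical stand-ins written for this example, not pyarrow classes, and the error message only approximates what Cython emits.

```python
class Schema:
    """Hypothetical stand-in for pyarrow.Schema (illustration only)."""


class Writer:
    """Hypothetical stand-in for ParquetWriter (illustration only)."""

    def __init__(self, where, schema):
        # Approximates the check Cython generates for
        # `Schema schema not None`: None is rejected up front instead of
        # being passed on to the native code, where (per the commit
        # message) it previously caused a segmentation fault.
        if not isinstance(schema, Schema):
            raise TypeError(
                "Argument 'schema' has incorrect type "
                f"(expected Schema, got {type(schema).__name__})"
            )
        self.where = where
        self.schema = schema


def try_construct(schema):
    """Return 'ok' on success, or a description of the TypeError raised."""
    try:
        Writer("some_path", schema)
    except TypeError as exc:
        return f"TypeError: {exc}"
    return "ok"


if __name__ == "__main__":
    print(try_construct(Schema()))
    print(try_construct(None))
```

With a valid schema the constructor succeeds; with None the caller gets an ordinary Python exception that can be caught, rather than a crashed interpreter.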
### Are these changes tested?
Yes, a new unit test, `test_parquet_invalid_writer`, was added.
### Are there any user-facing changes?
No
* Closes: #35858
Authored-by: Weston Pace <we...@gmail.com>
Signed-off-by: Joris Van den Bossche <jo...@gmail.com>
---
python/pyarrow/_parquet.pyx | 2 +-
python/pyarrow/tests/parquet/test_parquet_writer.py | 8 ++++++++
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/python/pyarrow/_parquet.pyx b/python/pyarrow/_parquet.pyx
index 2fc0494cbc..f9cd5289c7 100644
--- a/python/pyarrow/_parquet.pyx
+++ b/python/pyarrow/_parquet.pyx
@@ -1691,7 +1691,7 @@ cdef class ParquetWriter(_Weakrefable):
         int64_t dictionary_pagesize_limit
         object store_schema
-    def __cinit__(self, where, Schema schema, use_dictionary=None,
+    def __cinit__(self, where, Schema schema not None, use_dictionary=None,
                   compression=None, version=None,
                   write_statistics=None,
                   MemoryPool memory_pool=None,
diff --git a/python/pyarrow/tests/parquet/test_parquet_writer.py b/python/pyarrow/tests/parquet/test_parquet_writer.py
index 6ae4307135..e6fbd97053 100644
--- a/python/pyarrow/tests/parquet/test_parquet_writer.py
+++ b/python/pyarrow/tests/parquet/test_parquet_writer.py
@@ -93,6 +93,14 @@ def test_validate_schema_write_table(tempdir):
with pytest.raises(ValueError):
w.write_table(simple_table)
+def test_parquet_invalid_writer():
+
+    with pytest.raises(TypeError):
+        some_schema = pa.schema([pa.field("x", pa.int32())])
+        pq.ParquetWriter(None, some_schema)
+
+    with pytest.raises(TypeError):
+        pq.ParquetWriter("some_path", None)
@pytest.mark.pandas
@parametrize_legacy_dataset