Posted to issues@arrow.apache.org by "Alenka Frim (Jira)" <ji...@apache.org> on 2022/10/12 10:47:00 UTC

[jira] [Created] (ARROW-18001) [Python] parquet.write_table/parquet.ParquetWriter should accept a subset of columns

Alenka Frim created ARROW-18001:
-----------------------------------

             Summary: [Python] parquet.write_table/parquet.ParquetWriter should accept a subset of columns
                 Key: ARROW-18001
                 URL: https://issues.apache.org/jira/browse/ARROW-18001
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Python
            Reporter: Alenka Frim


This question came up in the GitHub issue [https://github.com/apache/arrow/issues/14025], and addressing it would be a good improvement to the Parquet part of PyArrow. I haven't found an existing issue for it, so I created a new one.
h6. Description

If a user wants to change the type of a single column when using {{parquet.write_table}}/{{parquet.ParquetWriter}}, they currently need to specify a schema that includes all columns. If a column is not specified in the schema, it will not be included in the Parquet file.
h6. Proposal

It should be possible for {{parquet.ParquetWriter}} to accept a schema that covers only a subset of the columns and to infer the types of all remaining columns.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)