Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2022/10/11 12:16:00 UTC

[jira] [Updated] (ARROW-16337) [Python] Expose parameter that determines to store Arrow schema in Parquet metadata in Python

     [ https://issues.apache.org/jira/browse/ARROW-16337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joris Van den Bossche updated ARROW-16337:
------------------------------------------
    Fix Version/s: 10.0.0

> [Python] Expose parameter that determines to store Arrow schema in Parquet metadata in Python
> ---------------------------------------------------------------------------------------------
>
>                 Key: ARROW-16337
>                 URL: https://issues.apache.org/jira/browse/ARROW-16337
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Joris Van den Bossche
>            Assignee: Joris Van den Bossche
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 10.0.0
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> There is a {{store_schema}} flag that determines whether the Arrow schema is stored in the Parquet metadata (under the {{ARROW:schema}} key). The flag is exposed in C++ but not in the Python interface. It would be good to expose it in the Python layer as well, to make it easier to experiment with (e.g., to check the impact of having the schema available or not when reading a file).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)