Posted to jira@arrow.apache.org by "Wes McKinney (Jira)" <ji...@apache.org> on 2021/10/12 19:52:00 UTC

[jira] [Created] (ARROW-14303) [C++][Parquet] Do not duplicate Schema metadata in Parquet schema metadata and serialized ARROW:schema value

Wes McKinney created ARROW-14303:
------------------------------------

             Summary: [C++][Parquet] Do not duplicate Schema metadata in Parquet schema metadata and serialized ARROW:schema value
                 Key: ARROW-14303
                 URL: https://issues.apache.org/jira/browse/ARROW-14303
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++
            Reporter: Wes McKinney
             Fix For: 6.0.0


Schema metadata values are being duplicated in the Parquet file footer: each key appears once in the Parquet schema (key-value) metadata and again inside the serialized ARROW:schema value. We should store them in only one of the two places. Removing them from the Parquet schema metadata may break applications that expect that metadata to be there when the file was written from Arrow, so dropping the keys from ARROW:schema is probably the safer choice.
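
To make the duplication concrete, here is a toy model in Python. It is only a sketch and not the Arrow C++ implementation: serialize_schema, footer_metadata and footer_size are hypothetical helpers, the field and metadata values are made up, and JSON stands in for the Arrow IPC (Flatbuffers) schema encoding. The one detail taken from the real format is that the serialized schema is stored base64-encoded under the ARROW:schema key.

```python
import base64
import json

ARROW_SCHEMA_KEY = "ARROW:schema"


def serialize_schema(fields, metadata):
    """Stand-in for Arrow IPC schema serialization.

    Assumption: real Arrow writes a Flatbuffers message; JSON is used
    here only so the example runs with the standard library.
    """
    payload = json.dumps({"fields": fields, "metadata": metadata}, sort_keys=True)
    return base64.b64encode(payload.encode("utf-8")).decode("ascii")


def footer_metadata(fields, metadata, dedupe):
    """Build the footer key-value metadata of a toy Parquet writer."""
    # The user's keys are always copied into the footer, so plain
    # Parquet readers can still see them.
    footer = dict(metadata)
    # With dedupe=True the embedded schema carries no metadata, which
    # models the option the issue calls the safer choice.
    embedded = {} if dedupe else metadata
    footer[ARROW_SCHEMA_KEY] = serialize_schema(fields, embedded)
    return footer


def footer_size(footer):
    """Rough footer size: total length of all keys and values."""
    return sum(len(key) + len(value) for key, value in footer.items())


fields = [{"name": "id", "type": "int64"}]
metadata = {"origin": "sensor-7", "notes": "x" * 1000}

duplicated = footer_metadata(fields, metadata, dedupe=False)
deduped = footer_metadata(fields, metadata, dedupe=True)

print("duplicated footer size:", footer_size(duplicated))
print("deduplicated footer size:", footer_size(deduped))
```

In this model a reader would rebuild the Arrow schema by decoding ARROW:schema and then re-attaching the remaining footer keys as schema metadata, so nothing is lost by storing the keys only once.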


