Posted to dev@parquet.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2019/05/01 22:46:00 UTC

[jira] [Resolved] (PARQUET-810) [C++] Read from file with schema evolution

     [ https://issues.apache.org/jira/browse/PARQUET-810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney resolved PARQUET-810.
----------------------------------
    Resolution: Won't Fix

I think this is an application-level concern. 

We provide both low-level and high-level (Arrow) APIs for reading Parquet files and examining their schemas. If the user is reading files with an expanded schema, they can determine at read time how to project the available columns into the target schema. I don't think this logic needs to be implemented in parquet-cpp.
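
As a rough illustration of what that application-level projection might look like, here is a minimal sketch. It deliberately avoids the parquet-cpp and Arrow APIs: Column, Table, and ProjectToTargetSchema are hypothetical names invented for this example, and a column is modelled as a plain vector of optional 64-bit integers. The assumption is that the application has already read whatever columns the older file contains and knows the row count.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not
// parquet-cpp or Arrow types.
using Column = std::vector<std::optional<long long>>;
using Table = std::map<std::string, Column>;

// Project the columns read from one file onto the target schema.
// Fields named in the target schema but absent from the file become
// all-null columns of length num_rows. Columns present in the file
// but not named in the target schema are dropped.
Table ProjectToTargetSchema(const Table& file_columns,
                            const std::vector<std::string>& target_fields,
                            std::size_t num_rows) {
  Table result;
  for (const std::string& name : target_fields) {
    auto it = file_columns.find(name);
    if (it != file_columns.end()) {
      // The file has this field: pass the data through unchanged.
      result[name] = it->second;
    } else {
      // The file predates this optional field: fill with nulls.
      result[name] = Column(num_rows, std::nullopt);
    }
  }
  return result;
}
```

With an older file that only has an "id" column and a target schema of {"id", "score"}, the result would contain the original "id" values plus a "score" column made up entirely of nulls. A real application would also need to decide how to handle type changes and nested fields, which this sketch does not attempt.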

> [C++] Read from file with schema evolution
> ------------------------------------------
>
>                 Key: PARQUET-810
>                 URL: https://issues.apache.org/jira/browse/PARQUET-810
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-cpp
>            Reporter: Wes McKinney
>            Priority: Major
>
> In a large dataset, new optional fields may appear later in the data's lifetime. Assuming we know the current schema and all fields, we must provide an API to read from an older file that is missing some of the optional fields in its metadata. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)