Posted to user@flink.apache.org by Rafi Aroch <ra...@gmail.com> on 2019/03/27 12:04:32 UTC

Support for Parquet schema evolution (a.k.a. mergeSchema)

Hi,

In my job I want to read Parquet files from buckets by date range.
For that I'm using the Hadoop Compatibility features to use
*ProtoParquetInputFormat*.
If the Parquet schema changed within the processed date range (even in
valid ways), the job fails with *IncompatibleSchemaModificationException*.

Has anyone encountered this issue?
Is that a known limitation?
Is there a way to solve this?

Would appreciate your help.

Thanks,
Rafi