Posted to commits@hudi.apache.org by "Alexander Trushev (Jira)" <ji...@apache.org> on 2022/04/27 09:05:00 UTC

[jira] [Created] (HUDI-3981) [UMBRELLA] Flink engine support for comprehensive schema evolution(RFC-33)

Alexander Trushev created HUDI-3981:
---------------------------------------

             Summary: [UMBRELLA] Flink engine support for comprehensive schema evolution(RFC-33)
                 Key: HUDI-3981
                 URL: https://issues.apache.org/jira/browse/HUDI-3981
             Project: Apache Hudi
          Issue Type: Epic
          Components: flink
            Reporter: Alexander Trushev
            Assignee: Alexander Trushev


h3. Context

Currently, the Flink engine does not support the schema evolution described in RFC-33.
Example 1. Assume Spark changes the type of a column:
{code:sql}
set hoodie.schema.on.read.enable=true
create table t1 (id int, val int, par string) ... partitioned by (par)
insert into t1 values (1, 10, 'p1')
alter table t1 alter column val type string
insert into t1 values (2, 'val20', 'p2')
{code}
When Flink then tries to read t1:
{code:sql}
create table t1 (id int, val string, par string) partitioned by (par) with (...)
select * from t1
{code}
the following error occurs:
{noformat}
java.lang.IllegalArgumentException: Unexpected type: INT32
{noformat}
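This is just one example; the errors may differ depending on the table type (COW/MOR), the query type (snapshot/incremental), the execution mode (batch/streaming), and the kind of schema change (for example, rename column or add column).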
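
The error in Example 1 appears to arise because the data file for partition p1 was written while val was still an int, whereas the Flink table declares val as string. The Python sketch below only illustrates the general idea of reconciling a file's schema with an evolved query schema at read time. It is not Hudi or Flink code: the function read_with_evolved_schema, the CASTS table, and the plain type names are all hypothetical and exist only for this illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical representation for this sketch: column name -> type name.
# These are not Hudi or Flink identifiers.
Schema = Dict[str, str]

# Casts from the type stored in an old file to the type in the evolved schema.
# Only the int -> string change from Example 1 is registered here.
CASTS: Dict[Tuple[str, str], Callable[[Any], Any]] = {
    ("int", "string"): str,
}


def read_with_evolved_schema(
    rows: List[dict], file_schema: Schema, query_schema: Schema
) -> List[dict]:
    """Project rows written with file_schema onto query_schema, casting where types differ."""
    result = []
    for row in rows:
        out = {}
        for name, wanted in query_schema.items():
            if name not in file_schema:
                # Column added after the file was written: surface it as NULL.
                out[name] = None
                continue
            stored = file_schema[name]
            if stored == wanted:
                out[name] = row[name]
            elif (stored, wanted) in CASTS:
                out[name] = CASTS[(stored, wanted)](row[name])
            else:
                raise ValueError(
                    f"Unsupported type change for {name}: {stored} -> {wanted}"
                )
        result.append(out)
    return result


# The row inserted before "alter table t1 alter column val type string".
old_file_schema = {"id": "int", "val": "int", "par": "string"}
evolved_schema = {"id": "int", "val": "string", "par": "string"}
old_rows = [{"id": 1, "val": 10, "par": "p1"}]

converted = read_with_evolved_schema(old_rows, old_file_schema, evolved_schema)
print(converted)  # [{'id': 1, 'val': '10', 'par': 'p1'}]
```

A reader that skips this reconciliation step and applies the evolved schema directly to the old file meets an int where it expects a string, which is the kind of mismatch the exception above reports.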

It is also not yet possible to write data after the schema has been changed.
Example 2. The sequence below leads to errors:
{noformat}
flink: write data
flink: stop job
spark: modify schema according to RFC-33
flink: new job with modified schema
flink: write data
{noformat}

h3. Proposal

Provide full support in the Flink engine for schemas modified according to RFC-33 (add column, rename column, change column type, drop column) in the following cases:

# batch/streaming
# mor (snapshot/incremental/optimized) read/write
# cow (snapshot/incremental) read/write
# mor compaction



