Posted to commits@hudi.apache.org by "liwei (Jira)" <ji...@apache.org> on 2020/10/09 16:01:00 UTC
[jira] [Updated] (HUDI-1301) use spark INCREMENTAL mode query hudi
dataset support schema version
[ https://issues.apache.org/jira/browse/HUDI-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
liwei updated HUDI-1301:
------------------------
Status: Patch Available (was: In Progress)
> use spark INCREMENTAL mode query hudi dataset support schema version
> ---------------------------------------------------------------------
>
> Key: HUDI-1301
> URL: https://issues.apache.org/jira/browse/HUDI-1301
> Project: Apache Hudi
> Issue Type: Bug
> Components: Spark Integration
> Reporter: liwei
> Assignee: liwei
> Priority: Major
> Labels: pull-request-available
>
> 1. Issue
> 1) On the write side, write two commits; the second commit adds a column, for example:
> commit1 schema and data
> id, name
> 1, lisi
>
> commit2 schema and data
> id, name, age
> 2, zhangsan, 18
>
> 2) On the read side,
> reading the latest commit returns
> id, name, age
> 1, lisi, null
> 2, zhangsan, 18
>
> reading only the first commit, by setting END_INSTANTTIME_OPT_KEY to the first commit, returns
> id, name, age
> 1, lisi, null
>
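For reference, the incremental read described above can be sketched in PySpark roughly as follows. This is a minimal sketch and not code from the issue: the helper names `incremental_read_options` and `read_incremental` are hypothetical, and the string keys are assumed to be the values behind `QUERY_TYPE_OPT_KEY`, `BEGIN_INSTANTTIME_OPT_KEY` and `END_INSTANTTIME_OPT_KEY` in `DataSourceReadOptions`, so they should be checked against the Hudi version in use.

```python
def incremental_read_options(begin_instant, end_instant=None):
    """Build Spark datasource options for a Hudi incremental query.

    The keys are assumed to match DataSourceReadOptions; verify them
    against your Hudi release.
    """
    opts = {
        "hoodie.datasource.query.type": "incremental",
        # Only commits with an instant time greater than this are read.
        "hoodie.datasource.read.begin.instanttime": begin_instant,
    }
    if end_instant is not None:
        # Commits with an instant time up to and including this are read.
        opts["hoodie.datasource.read.end.instanttime"] = end_instant
    return opts


def read_incremental(spark, base_path, begin_instant, end_instant=None):
    """Load a Hudi table incrementally; `spark` is an existing SparkSession."""
    return (
        spark.read.format("hudi")
        .options(**incremental_read_options(begin_instant, end_instant))
        .load(base_path)
    )
```

Calling `read_incremental(spark, base_path, "000", first_commit_time)`, where `first_commit_time` is a placeholder for the instant time of commit1, would correspond to the second read in the report.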
> 2. Solution
> As shown above, reading the first commit also returns the "age" column. I think that if END_INSTANTTIME_OPT_KEY is set to the first commit, both the schema and the data should be those of that commit.
> More precisely, the read should return
> id, name
> 1, lisi
>
>
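The requested behaviour can be illustrated with a toy model in plain Python. This is only a sketch of the semantics proposed in the issue, not Hudi's implementation: the `COMMITS` list, the instant values "001" and "002", and the `read_up_to` function are all hypothetical stand-ins for the commit timeline.

```python
# Toy commit timeline, oldest first (hypothetical instants "001" and "002").
COMMITS = [
    {"instant": "001", "schema": ["id", "name"],
     "rows": [{"id": 1, "name": "lisi"}]},
    {"instant": "002", "schema": ["id", "name", "age"],
     "rows": [{"id": 2, "name": "zhangsan", "age": 18}]},
]


def read_up_to(commits, end_instant):
    """Return (schema, rows) for all commits with instant <= end_instant.

    The schema is taken from the last selected commit rather than from
    the latest commit on the timeline, which is the change the issue asks for.
    """
    selected = [c for c in commits if c["instant"] <= end_instant]
    if not selected:
        return [], []
    schema = selected[-1]["schema"]
    # Columns missing from older rows are filled with None (null).
    rows = [
        tuple(row.get(col) for col in schema)
        for commit in selected
        for row in commit["rows"]
    ]
    return schema, rows
```

With this model, `read_up_to(COMMITS, "001")` yields only the `id` and `name` columns, while `read_up_to(COMMITS, "002")` yields all three columns with a null `age` for the first row, matching the two tables in the report.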
--
This message was sent by Atlassian Jira
(v8.3.4#803005)