Posted to commits@hudi.apache.org by "liwei (Jira)" <ji...@apache.org> on 2020/10/09 16:01:00 UTC

[jira] [Updated] (HUDI-1301) use spark INCREMENTAL mode query hudi dataset support schema version

     [ https://issues.apache.org/jira/browse/HUDI-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liwei updated HUDI-1301:
------------------------
    Status: Patch Available  (was: In Progress)

> use spark INCREMENTAL mode query hudi  dataset support schema version
> ---------------------------------------------------------------------
>
>                 Key: HUDI-1301
>                 URL: https://issues.apache.org/jira/browse/HUDI-1301
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Spark Integration
>            Reporter: liwei
>            Assignee: liwei
>            Priority: Major
>              Labels: pull-request-available
>
> I. Issue
> 1. On the write side, write two commits; the second commit adds a column. For example:
> commit 1 schema and data:
> id, name
> 1, lisi
>  
> commit 2 schema and data:
> id, name, age
> 2, zhangsan, 18
>  
> 2. On the read side,
> reading the latest commit returns:
> id, name, age
> 1, lisi, null
> 2, zhangsan, 18
>  
> Reading the first commit, by setting END_INSTANTTIME_OPT_KEY to the first commit, returns:
> id, name, age
> 1, lisi, null
>  
> II. Solution
> As shown above, reading the first commit also returns the "age" column. I think that if END_INSTANTTIME_OPT_KEY is set to the first commit, both the schema and the data should correspond to that commit.
> More precisely, the query should return:
> id, name
> 1, lisi
>  
>  
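
The difference between the reported and the requested behaviour can be sketched with a small toy model. This is not Hudi code: the commit list, the instant values "001" and "002", and the function read_until are hypothetical names used only for illustration. It assumes that instants compare in commit order as strings and that a row missing a column is read back as null (None).

```python
# Toy model of the issue, not the Hudi implementation.
# Each commit carries the schema it was written with and its rows.
COMMITS = [
    {"instant": "001", "schema": ["id", "name"],
     "rows": [{"id": 1, "name": "lisi"}]},
    {"instant": "002", "schema": ["id", "name", "age"],
     "rows": [{"id": 2, "name": "zhangsan", "age": 18}]},
]


def read_until(commits, end_instant, schema_from_end_instant):
    """Return (schema, rows) for all commits with instant <= end_instant.

    schema_from_end_instant=False models the reported behaviour
    (the latest table schema is always used); True models the
    requested behaviour (the schema of the last commit in range).
    """
    selected = [c for c in commits if c["instant"] <= end_instant]
    if not selected:
        return [], []
    schema_source = selected[-1] if schema_from_end_instant else commits[-1]
    schema = schema_source["schema"]
    # Columns absent from an older row are filled with None (null).
    rows = [tuple(row.get(col) for col in schema)
            for c in selected for row in c["rows"]]
    return schema, rows


if __name__ == "__main__":
    print(read_until(COMMITS, "001", schema_from_end_instant=False))
    print(read_until(COMMITS, "001", schema_from_end_instant=True))
```

With the end instant set to the first commit, the first call includes the "age" column filled with null, while the second returns only "id" and "name", which is the result the description asks for.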



--
This message was sent by Atlassian Jira
(v8.3.4#803005)