Posted to commits@hudi.apache.org by "Nicholas Jiang (Jira)" <ji...@apache.org> on 2021/09/16 09:20:00 UTC

[jira] [Commented] (HUDI-2441) To support partial update function which can move and update the data from the old partition to the new partition, when the data with the same key changes its partition

    [ https://issues.apache.org/jira/browse/HUDI-2441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17416003#comment-17416003 ] 

Nicholas Jiang commented on HUDI-2441:
--------------------------------------

[~yanghua], I am interested in working on this issue. Could you please assign this ticket to me?

> To support partial update function which can move and update the data from the old partition to the new partition , when the data with same key change it's partition
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-2441
>                 URL: https://issues.apache.org/jira/browse/HUDI-2441
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Storage Management
>            Reporter: David_Liang
>            Priority: Major
>
> Consider the following scenario: the source table contains two records for the same key:
> ||post_id ||position||weight||ts||day ||
> | 1|shengzhen|3KG|1630480027|{color:#ff0000}20210901{color}|
> | 1|beijing|3KG|1630652828|{color:#ff0000}20210903{color}|
>  
> When using the {color:#ff0000}*Global Index*{color} with SQL such as the following:
>  
> {code:java}
> merge into target_hudi_table  t
>    using (
>         select post_id, position, ts , day from source_table
>    ) as s
> on t.post_id = s.post_id
> when matched then update set t.position = s.position, t.ts = s.ts, t.day = s.day
> when not matched then insert *
> {code}
>  
> Because the Hudi engine does not yet support *cross-partition partial merge into*, the result in the target table is:
>  
> ||post_id (as primary key)||position||weight||ts||day||
> | 1|beijing|3KG|1630652828|*{color:#ff0000}20210901{color}*|
> The record is still in the old partition.
>  
> But the *expected* result is:
> ||post_id (as primary key)||position||weight||ts||day||
> | 1|beijing|3KG|1630652828|{color:#ff0000}*20210903*{color}|
>  
>  
>  
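
To make the expected behaviour in the quoted description concrete, here is a minimal Python sketch of a cross-partition partial merge. It is only an illustration under stated assumptions, not Hudi code: the in-memory `table` and `global_index` dictionaries and the function `merge_with_partition_move` are hypothetical names made up for this example. It assumes the global index maps each record key to the single partition that currently holds it.

```python
from typing import Dict

Record = Dict[str, object]


def merge_with_partition_move(
    table: Dict[str, Dict[object, Record]],
    global_index: Dict[object, str],
    incoming: Record,
    key_field: str = "post_id",
    partition_field: str = "day",
) -> Record:
    """Hypothetical helper: upsert `incoming`, moving the record if its partition changed."""
    key = incoming[key_field]
    new_partition = str(incoming[partition_field])
    old_partition = global_index.get(key)

    if old_partition is None:
        # Not matched: plain insert of the incoming record.
        merged = dict(incoming)
    else:
        # Matched: remove the record from the partition that currently holds it ...
        existing = table[old_partition].pop(key)
        if not table[old_partition]:
            del table[old_partition]
        # ... and overlay only the columns present in the update (partial update),
        # so columns such as `weight` that the update omits are preserved.
        merged = {**existing, **incoming}

    # Write the merged record into the partition named by the new partition value.
    table.setdefault(new_partition, {})[key] = merged
    global_index[key] = new_partition
    return merged


table: Dict[str, Dict[object, Record]] = {}
global_index: Dict[object, str] = {}

# First record from the source table in the description.
merge_with_partition_move(table, global_index, {
    "post_id": 1, "position": "shengzhen", "weight": "3KG",
    "ts": 1630480027, "day": "20210901",
})
# Second record: same key, new partition, and no `weight` column in the update.
merge_with_partition_move(table, global_index, {
    "post_id": 1, "position": "beijing", "ts": 1630652828, "day": "20210903",
})
```

After both calls, the sketch holds a single record for `post_id` 1 in partition `20210903`, with `weight` still `3KG`, which matches the expected result table in the description.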



--
This message was sent by Atlassian Jira
(v8.3.4#803005)