Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/02/04 15:05:53 UTC

[GitHub] [hudi] nsivabalan edited a comment on issue #2284: [SUPPORT] : Is there a option to achieve SCD 2 in Hudi?

nsivabalan edited a comment on issue #2284:
URL: https://github.com/apache/hudi/issues/2284#issuecomment-773374732


   Hey folks, let me try to understand your use case better. I was not familiar with SCD2 and found [this](https://adatis.co.uk/introduction-to-slowly-changing-dimensions-scd-types/) through my friend (Google ;) ). I will illustrate with an example; let me know if my understanding is right.
   
   At t1 (C1 commit)
   // incoming record

   recId | name | ... all cols ... | effective from | effective to | isActive
   ----- | ---- | ---------------- | -------------- | ------------ | --------
   rec1  | bob  | ...              | t1             | null         | true
   
   This record will be stored as-is in Hudi, with some additional Hudi meta fields:

   recId | name | ... all cols ... | effective from | effective to | isActive | hudi_commit_time | ... other meta fields
   ----- | ---- | ---------------- | -------------- | ------------ | -------- | ---------------- | ---------------------
   rec1  | bob  | ...              | t1             | null         | true     | t1               | ...
   
   At t5 (C2 commit)
   // incoming record

   recId | name | ... all cols ... | effective from | effective to | isActive
   ----- | ---- | ---------------- | -------------- | ------------ | --------
   rec1  | bob  | ...              | t5             | null         | true
   
   // When this is merged with Hudi, you want to end up with the following rows in Hudi:

   recId | name | ... all cols ... | effective from | effective to | isActive | hudi_commit_time | ... other meta fields
   ----- | ---- | ---------------- | -------------- | ------------ | -------- | ---------------- | ---------------------
   rec1  | bob  | ...              | t1             | t5           | false    | t1               | ...
   rec1  | bob  | ...              | t5             | null         | true     | t5               | ...
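
To make the expected outcome concrete, here is a minimal plain-Python sketch of the merge semantics I am describing. It is not Hudi code and uses no Hudi API: `scd2_merge` is a hypothetical helper, and the keys `effective_from` / `effective_to` are assumed spellings of the columns above. It only shows that the previously active row gets closed (while keeping its original commit time, as in the table above) and the incoming row is appended as the new active version.

```python
from typing import Dict, List

Row = Dict[str, object]


def scd2_merge(table: List[Row], incoming: Row, commit_time: str) -> List[Row]:
    """Hypothetical helper: apply `incoming` to `table` as an SCD2 change."""
    merged: List[Row] = []
    for row in table:
        if row["recId"] == incoming["recId"] and row["isActive"]:
            # Close the currently active version at the start of the new one.
            # Its hudi_commit_time is left untouched, matching the table above.
            row = {**row,
                   "effective_to": incoming["effective_from"],
                   "isActive": False}
        merged.append(row)
    # Append the new active version, stamped with the commit that wrote it.
    merged.append({**incoming, "hudi_commit_time": commit_time})
    return merged


# C1 commit at t1, then C2 commit at t5 for the same record key.
table: List[Row] = []
table = scd2_merge(
    table,
    {"recId": "rec1", "name": "bob", "effective_from": "t1",
     "effective_to": None, "isActive": True},
    commit_time="t1",
)
table = scd2_merge(
    table,
    {"recId": "rec1", "name": "bob", "effective_from": "t5",
     "effective_to": None, "isActive": True},
    commit_time="t5",
)

for row in table:
    print(row)
```

After the second call the table holds two rows for `rec1`: the t1 version with `effective_to` set to t5 and `isActive` false, and the t5 version still open and active.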
   
   Let me know if this is what you are looking for. We can discuss further. 
   

