Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/08/23 07:30:32 UTC

[GitHub] [iceberg] linfey90 opened a new pull request, #5619: Flink: Delete the data when UPDATE_BEFORE

linfey90 opened a new pull request, #5619:
URL: https://github.com/apache/iceberg/pull/5619

   Usually, an update produces both a -U and a +U row in the changelog. But when a CDC synchronization job has a filter condition, such as rate > 10, the pair can be split. If we insert a row with rate greater than 10 in the source (for example MySQL), the condition is met and the row is written. If we then update that row so that rate is less than 10, the +U row no longer meets the condition and is filtered out, so only the -U row is transmitted. If the sink does not delete the data on UPDATE_BEFORE, the results will be incorrect, so I think UPDATE_BEFORE must also be handled in UPSERT mode when there is no UPDATE_AFTER.
   The SQL looks like: insert into hive_catalog.dm_ods.basic_ods select * from dm_mapping.default_catalog.basic_source where rate > 20;
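
The effect described above can be sketched without any Flink or Iceberg dependency. This is a minimal simulation, not Iceberg's actual writer: the class `UpsertSinkSketch`, the `Change` type, and the `deleteOnUpdateBefore` flag are all hypothetical names introduced for illustration. It assumes the sink is keyed by `id` and receives only the rows that survived the `rate > 10` filter.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an upsert sink keyed by id; not Iceberg's actual writer.
final class UpsertSinkSketch {

    // Local mirror of the four changelog row kinds.
    enum Kind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

    static final class Change {
        final Kind kind;
        final int id;
        final int rate;

        Change(Kind kind, int id, int rate) {
            this.kind = kind;
            this.id = id;
            this.rate = rate;
        }
    }

    // Applies the changes in order and returns the final id -> rate table.
    static Map<Integer, Integer> apply(List<Change> changes, boolean deleteOnUpdateBefore) {
        Map<Integer, Integer> table = new LinkedHashMap<>();
        for (Change c : changes) {
            switch (c.kind) {
                case INSERT:
                case UPDATE_AFTER:
                    table.put(c.id, c.rate);
                    break;
                case UPDATE_BEFORE:
                    // The behavior under discussion: treat a -U as a delete or ignore it.
                    if (deleteOnUpdateBefore) {
                        table.remove(c.id);
                    }
                    break;
                case DELETE:
                    table.remove(c.id);
                    break;
            }
        }
        return table;
    }

    // What reaches the sink once the filter has dropped the +U row (rate = 5).
    static List<Change> filteredChangelog() {
        List<Change> changes = new ArrayList<>();
        changes.add(new Change(Kind.INSERT, 1, 15));
        changes.add(new Change(Kind.UPDATE_BEFORE, 1, 15));
        return changes;
    }

    public static void main(String[] args) {
        // Ignoring -U leaves the stale row for id 1 in the table.
        System.out.println(apply(filteredChangelog(), false));
        // Deleting on -U removes it, matching the filtered source.
        System.out.println(apply(filteredChangelog(), true));
    }
}
```

With `deleteOnUpdateBefore` set to false the row with rate 15 stays in the table even though the source row no longer satisfies the condition; with it set to true the table ends up empty.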


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] linfey90 commented on pull request #5619: Flink: Delete the data when UPDATE_BEFORE

Posted by GitBox <gi...@apache.org>.
linfey90 commented on PR #5619:
URL: https://github.com/apache/iceberg/pull/5619#issuecomment-1225076167

   I don't think this is a CDC problem. UPDATE_BEFORE and UPDATE_AFTER were generated together, but UPDATE_AFTER was filtered out because it did not meet the condition.
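
A small sketch of how the pair gets split, assuming the filter is evaluated on each changelog row independently, as this comment describes. The class `FilteredChangelogSketch` and its methods are hypothetical; only the "-U" and "+U" short strings are taken from Flink's RowKind.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a WHERE clause applied row by row to a changelog.
final class FilteredChangelogSketch {

    static final class Change {
        final String kind; // "-U" or "+U", the short strings used by Flink's RowKind
        final int id;
        final int rate;

        Change(String kind, int id, int rate) {
            this.kind = kind;
            this.id = id;
            this.rate = rate;
        }

        @Override
        public String toString() {
            return kind + "(" + id + ", " + rate + ")";
        }
    }

    // Keeps only the rows whose rate exceeds the threshold, regardless of row kind.
    static List<Change> whereRateGreaterThan(List<Change> changes, int threshold) {
        List<Change> kept = new ArrayList<>();
        for (Change c : changes) {
            if (c.rate > threshold) {
                kept.add(c);
            }
        }
        return kept;
    }

    // The source emits both halves of the update: rate goes from 15 to 5.
    static List<Change> updatePair() {
        List<Change> pair = new ArrayList<>();
        pair.add(new Change("-U", 1, 15));
        pair.add(new Change("+U", 1, 5));
        return pair;
    }

    public static void main(String[] args) {
        // Only the -U row carries a rate above 10, so the +U row is dropped.
        System.out.println(whereRateGreaterThan(updatePair(), 10));
    }
}
```

The source did emit a complete pair; the lone -U only appears downstream of the filter.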


[GitHub] [iceberg] pvary commented on pull request #5619: Flink: Delete the data when UPDATE_BEFORE

Posted by GitBox <gi...@apache.org>.
pvary commented on PR #5619:
URL: https://github.com/apache/iceberg/pull/5619#issuecomment-1224670302

   @linfey90: Could you please add a test case which fails before the fix and succeeds after the fix, so we can prevent a possible later regression?
   
   Thanks, Peter 


[GitHub] [iceberg] linfey90 commented on pull request #5619: Flink: Delete the data when UPDATE_BEFORE

Posted by GitBox <gi...@apache.org>.
linfey90 commented on PR #5619:
URL: https://github.com/apache/iceberg/pull/5619#issuecomment-1225143104

   > @linfey90: Could you please add a test case which fails before the fix and succeeds after the fix, so we can prevent a possible later regression?
   > 
   > Thanks, Peter
   
   Test added, but the two cases cannot occur at the same time. Please take another look, thanks!


[GitHub] [iceberg] stevenzwu commented on pull request #5619: Flink: Delete the data when UPDATE_BEFORE

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on PR #5619:
URL: https://github.com/apache/iceberg/pull/5619#issuecomment-1224956163

   Is this a problem with the CDC source? Should the CDC source generate a `DELETE` in this case?
   
   Looking at the Flink Javadoc, it seems that UPDATE_BEFORE should go hand in hand with UPDATE_AFTER.
   ```
       /**
        * Update operation with the previous content of the updated row.
        *
        * <p>This kind SHOULD occur together with {@link #UPDATE_AFTER} for modelling an update that
        * needs to retract the previous row first. It is useful in cases of a non-idempotent update,
        * i.e., an update of a row that is not uniquely identifiable by a key.
        */
       UPDATE_BEFORE("-U", (byte) 1),
   ```

