Posted to issues@iceberg.apache.org by "zlzhang0122 (via GitHub)" <gi...@apache.org> on 2024/04/17 12:40:29 UTC

[I] Iceberg may occur data duplication when use flink to write data to iceberg and commit failed [iceberg]

zlzhang0122 opened a new issue, #10165:
URL: https://github.com/apache/iceberg/issues/10165

   ### Apache Iceberg version
   
   1.3.0
   
   ### Query engine
   
   Spark
   
   ### Please describe the bug 🐞
   
   Data may be duplicated when Flink writes to an Iceberg table and a commit fails. Iceberg cannot distinguish the snapshots emitted by each checkpoint, so if the committer stalls for a moment (or a similar situation occurs), it commits all the snapshots produced by the current checkpoint together with those of all previous checkpoints.
   We can change some Flink configuration to avoid this problem, but I don't think that is an ideal solution, since it can cause other related problems. Maybe we could emit the snapshots together with their checkpointId, which would solve the problem completely. What do you think? Any reply would be appreciated, thanks!
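
   The proposal above can be illustrated with a minimal, self-contained sketch. It does not use the Iceberg or Flink APIs: `FakeTable`, `Committer`, `stage`, and `commitUpTo` are hypothetical names invented for this illustration. The only idea it shows is that if each batch of files carries its checkpoint id, and the table remembers the highest checkpoint id already committed, then a replayed batch can be skipped instead of being committed twice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: none of these types exist in Iceberg or Flink.
class CheckpointCommitSketch {

    // Hypothetical stand-in for a table: the files it contains and the
    // highest checkpoint id whose files have already been committed.
    static final class FakeTable {
        final List<String> committedFiles = new ArrayList<>();
        long maxCommittedCheckpointId = -1L;
    }

    // Hypothetical committer that keeps pending files keyed by checkpoint id.
    static final class Committer {
        private final FakeTable table;
        private final TreeMap<Long, List<String>> pending = new TreeMap<>();

        Committer(FakeTable table) {
            this.table = table;
        }

        // Record the files produced by one checkpoint, tagged with its id.
        void stage(long checkpointId, List<String> files) {
            pending.put(checkpointId, new ArrayList<>(files));
        }

        // Commit every pending checkpoint up to and including checkpointId,
        // skipping any checkpoint the table has already seen.
        // Returns the number of checkpoints actually committed.
        int commitUpTo(long checkpointId) {
            int committed = 0;
            Map<Long, List<String>> ready = pending.headMap(checkpointId, true);
            for (Map.Entry<Long, List<String>> entry : ready.entrySet()) {
                if (entry.getKey() <= table.maxCommittedCheckpointId) {
                    continue; // already committed: a replay, not new data
                }
                table.committedFiles.addAll(entry.getValue());
                table.maxCommittedCheckpointId = entry.getKey();
                committed++;
            }
            ready.clear(); // the view is backed by 'pending', so this drains it
            return committed;
        }
    }

    public static void main(String[] args) {
        FakeTable table = new FakeTable();
        Committer committer = new Committer(table);

        committer.stage(1L, List.of("a.parquet"));
        committer.stage(2L, List.of("b.parquet"));
        committer.commitUpTo(2L);

        // Simulate a replay: checkpoint 2 is staged again along with checkpoint 3.
        committer.stage(2L, List.of("b.parquet"));
        committer.stage(3L, List.of("c.parquet"));
        committer.commitUpTo(3L);

        System.out.println(table.committedFiles); // prints [a.parquet, b.parquet, c.parquet]
    }
}
```

   In this sketch the replayed checkpoint 2 is ignored because its id is not greater than the recorded maximum, so `b.parquet` appears only once. Whether the real committer can rely on such a comparison depends on details not described in this report.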


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


Re: [I] Iceberg may occur data duplication when use flink to write data to iceberg and commit failed [iceberg]

Posted by "pvary (via GitHub)" <gi...@apache.org>.
pvary closed issue #10165: Iceberg may occur data duplication when use flink to write data to iceberg and commit failed
URL: https://github.com/apache/iceberg/issues/10165




Re: [I] Iceberg may occur data duplication when use flink to write data to iceberg and commit failed [iceberg]

Posted by "pvary (via GitHub)" <gi...@apache.org>.
pvary commented on issue #10165:
URL: https://github.com/apache/iceberg/issues/10165#issuecomment-2065036455

   Could you please describe the exact si

