Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/12/06 07:37:17 UTC

[GitHub] [iceberg] zstraw commented on issue #4550: the snapshot file is lost when write iceberg using flink Failed to open input stream for file File does not exist

zstraw commented on issue #4550:
URL: https://github.com/apache/iceberg/issues/4550#issuecomment-1338908066

   After digging into the Iceberg code and the logs, I was able to reproduce this locally in a debugger.
   
   The scenario can occur while a Flink job is being cancelled:
   1. IcebergFilesCommitter is about to commit. During the **rename** of metadata.json (org.apache.iceberg.hadoop.HadoopTableOperations#renameToFinal), org.apache.hadoop.ipc.Client.call throws an **InterruptedIOException**. I suspect the interrupt comes from the Flink task being cancelled. Meanwhile, **HDFS has already renamed the metadata.json file successfully**.
   2. After the rename fails on the client side, the commit is supposed to be retried. However, the thread hits an InterruptedException while sleeping between attempts (org.apache.iceberg.util.Tasks#runTaskWithRetry) and throws a RuntimeException, so the version-hint is never updated.
   3. The RuntimeException triggers a **rollback** in org.apache.iceberg.BaseTransaction#cleanUpOnCommitFailure, which deletes the manifest list (snap-XXX).
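
The three steps above can be sketched as a toy simulation. This is not Iceberg or Hadoop code: the class `InterruptedCommitSketch`, its fields, the file names, and the in-memory "file system" are all hypothetical and only model the ordering of events described in the list, under the assumption that the rename takes effect on the server before the client sees the interrupt.

```java
import java.io.InterruptedIOException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the failure sequence; none of these names exist in Iceberg.
class InterruptedCommitSketch {
    // Toy in-memory "file system": a set of file names.
    final Set<String> files = new HashSet<>();
    // Stand-in for the version-hint file; starts at the last good version.
    int versionHint = 1;

    // Step 1: the rename is applied on the "server" side, but the client
    // is interrupted before it sees the response.
    void renameToFinal(String src, String dst) throws InterruptedIOException {
        files.remove(src);
        files.add(dst);
        throw new InterruptedIOException("interrupted while waiting for rename response");
    }

    void commit(String snapshotId, int nextVersion) {
        String manifestList = "snap-" + snapshotId + ".avro";
        String tempMetadata = "temp.metadata.json";
        String finalMetadata = "v" + nextVersion + ".metadata.json";
        files.add(manifestList);
        files.add(tempMetadata);
        try {
            try {
                renameToFinal(tempMetadata, finalMetadata);
                versionHint = nextVersion; // never reached in this scenario
            } catch (InterruptedIOException e) {
                // Step 2: the task is still being cancelled, so the back-off
                // sleep before the retry is interrupted as well.
                Thread.currentThread().interrupt(); // simulate the pending cancel
                try {
                    Thread.sleep(100);
                } catch (InterruptedException ie) {
                    throw new RuntimeException(ie);
                }
            }
        } catch (RuntimeException e) {
            // Step 3: cleanup on commit failure deletes the manifest list,
            // although the renamed metadata file still refers to it.
            files.remove(manifestList);
        }
    }
}
```

Calling `commit("123", 2)` on a fresh instance leaves `v2.metadata.json` in `files`, removes `snap-123.avro`, and keeps `versionHint` at 1. That is the inconsistent state reported in this issue: a metadata file that points at a manifest list which no longer exists.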


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
