Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/02/03 02:26:00 UTC

[GitHub] [iceberg] Stephen-Robin edited a comment on pull request #2196: Core: Data loss after compaction #2195

Stephen-Robin edited a comment on pull request #2196:
URL: https://github.com/apache/iceberg/pull/2196#issuecomment-772160296


   > I am sorry, this is a known bug. I found it when I worked on the Rewrite Action and opened PR #1762, which has not been merged yet. The purpose of the rewrite action is to compact small files, so I think it is more reasonable to exclude data files whose size exceeds the target size during the table scan.
   
   @zhangjun0x01  
   Hi zhangjun, I found that large files exceeding the threshold are filtered out in PR [#1762](https://github.com/apache/iceberg/pull/1762/files). Perhaps the rewrite data files operation should not only merge small files but also split and rewrite large files. This PR already rewrites large files after splitting them. What do you think? Thanks also to rdblue for pushing this issue.
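
   To make the two options concrete, here is a rough sketch that works on plain file sizes only. It is not Iceberg's rewrite action and not the code from either PR; `split_large`, `pack_small`, and `plan_rewrite` are hypothetical names invented for this illustration, and a file exactly at the target size is assumed to be left alone.

```python
from typing import List, Tuple


def split_large(size: int, target: int) -> List[int]:
    """Split one file of `size` bytes into chunks of at most `target` bytes."""
    full, rest = divmod(size, target)
    return [target] * full + ([rest] if rest else [])


def pack_small(sizes: List[int], target: int) -> List[List[int]]:
    """Greedily group small files, in order, so each group stays <= target."""
    groups: List[List[int]] = []
    current: List[int] = []
    total = 0
    for s in sizes:
        if current and total + s > target:
            groups.append(current)
            current, total = [], 0
        current.append(s)
        total += s
    if current:
        groups.append(current)
    return groups


def plan_rewrite(
    sizes: List[int], target: int, split_large_files: bool
) -> Tuple[List[int], List[int]]:
    """Return (sizes of rewritten output files, sizes of untouched files).

    split_large_files=False models the "exclude files above the target" option;
    split_large_files=True models the "also split large files" option.
    """
    small = [s for s in sizes if s < target]
    kept = [s for s in sizes if s == target]
    large = [s for s in sizes if s > target]

    # Each group of small files is merged into one output file.
    outputs = [sum(group) for group in pack_small(small, target)]

    if split_large_files:
        for s in large:
            outputs.extend(split_large(s, target))
    else:
        kept.extend(large)
    return outputs, kept
```

   Whichever option is chosen, the bytes in the rewritten files plus the bytes in the untouched files should equal the bytes in the original files; a plan that drops large files from both lists would lose data.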

