Posted to commits@hudi.apache.org by "vedantkhandelwal (via GitHub)" <gi...@apache.org> on 2023/04/27 16:41:00 UTC

[GitHub] [hudi] vedantkhandelwal commented on issue #8396: [SUPPORT] Cleaner configs not working . Need to clean .hoodie files after certain interval/batch

vedantkhandelwal commented on issue #8396:
URL: https://github.com/apache/hudi/issues/8396#issuecomment-1526014493

   I've figured out the issue. We ran our data pipelines on Hudi version 8 (from Feb 2022 to July 2022), then migrated to Hudi version 9 (July to mid Feb 2023). We noticed that archive files were being created in both the .hoodie and .hoodie/archived/ directories,
   for instance:
   .hoodie/.commits_.archive.110_1-0-1
   .hoodie/archived/.commits_.archive.110_1-0-1 
   
   Then we migrated to Hudi 11.1. The cleaner was working there, but the number of files on S3 kept increasing.
   Next, we deleted all archive, requested, clean, rollback, deltacommit, compaction, commit, and inflight files older than 2 days (retaining only the last 2 days of files).
   Finally, we migrated to Hudi version 12.2. The cleaner is working fine there too, and the number of files is now bounded.
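   For anyone wanting to review which files would fall under the 2-day cutoff described above before removing anything, here is a minimal sketch. It is not part of Hudi and not the exact procedure we ran: `select_stale_timeline_files` and `TIMELINE_MARKERS` are hypothetical names, the matching rule (any dot-separated part of the file name equal to one of the listed types) is an assumption, and the caller is assumed to supply the (file name, last-modified time) pairs from their own listing of the .hoodie directory. It only returns candidate names and deletes nothing.

```python
from datetime import datetime, timedelta, timezone

# File types named in the comment above. Matching on dot-separated
# parts of the file name is an assumption made for this sketch.
TIMELINE_MARKERS = frozenset({
    "archive", "requested", "clean", "rollback",
    "deltacommit", "compaction", "commit", "inflight",
})


def select_stale_timeline_files(entries, now, retention=timedelta(days=2)):
    """Return names of files older than `retention` whose name has a listed type.

    `entries` is an iterable of (file_name, last_modified) pairs, where
    last_modified is a timezone-aware datetime. Nothing is deleted here.
    """
    cutoff = now - retention
    stale = []
    for name, modified in entries:
        parts = set(name.split("."))
        if modified < cutoff and parts & TIMELINE_MARKERS:
            stale.append(name)
    return stale


if __name__ == "__main__":
    now = datetime(2023, 4, 27, tzinfo=timezone.utc)
    listing = [
        ("20230420101010.commit", now - timedelta(days=7)),
        ("20230426101010.commit", now - timedelta(days=1)),
        ("hoodie.properties", now - timedelta(days=400)),
        (".commits_.archive.110_1-0-1", now - timedelta(days=30)),
    ]
    for name in select_stale_timeline_files(listing, now):
        print(name)
```

   Files such as hoodie.properties are never selected because none of their name parts matches a listed type, regardless of age.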
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org