Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/11/12 11:37:03 UTC

[GitHub] [hudi] veenaypatil commented on pull request #3646: [HUDI-349]: Added new cleaning policy based on number of hours

veenaypatil commented on pull request #3646:
URL: https://github.com/apache/hudi/pull/3646#issuecomment-967034819


   Any update on this? I hit an issue today where an ETL job failed in prod because the file had been deleted by the Cleaner service:
   ```
   Caused by: java.io.FileNotFoundException: No such file or directory: s3a://bucket/cdcv2/data/a02b9653-d715-43a7-8faf-950cbdafebc4-0_123-95450-42475479_20211111205930.parquet
   [2021-11-12 00:40:15,491] {logging_mixin.py:112} INFO - [2021-11-12 00:40:15,491] {batch.py:321} INFO - 	at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2269)
   ```
   
   A retention config based on time would help here, instead of relying on the number of commits.
   cc @vinothchandar 
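
To illustrate why a time-based setting can be easier to reason about than a commit count, here is a minimal Python sketch. It is a toy model, not Hudi's cleaner: the function names, the 10-minute commit interval, and the retention values (10 commits, 6 hours) are all hypothetical. The two timestamps come from the log above, assuming the trailing `20211111205930` in the file name is the commit instant and `2021-11-12 00:40:15` is when the read failed.

```python
from datetime import datetime, timedelta


def retained_by_commit_count(commit_times, commits_retained):
    """Toy count-based retention: keep only the newest N commit times."""
    ordered = sorted(commit_times)
    return ordered[-commits_retained:] if commits_retained > 0 else []


def retained_by_hours(commit_times, hours_retained, now):
    """Toy time-based retention: keep commits newer than now - hours_retained."""
    cutoff = now - timedelta(hours=hours_retained)
    return [t for t in sorted(commit_times) if t >= cutoff]


# Timestamps read from the log excerpt (see assumptions in the lead-in).
file_commit = datetime(2021, 11, 11, 20, 59, 30)
read_time = datetime(2021, 11, 12, 0, 40, 15)

# Hypothetical write pattern: one commit every 10 minutes until the read.
commits = []
t = file_commit
while t <= read_time:
    commits.append(t)
    t += timedelta(minutes=10)

# Hypothetical retention settings for the two policies.
by_count = retained_by_commit_count(commits, commits_retained=10)
by_hours = retained_by_hours(commits, hours_retained=6, now=read_time)

print("kept by commit count:", file_commit in by_count)
print("kept by hours:", file_commit in by_hours)
```

Under these assumed numbers the commit that wrote the file falls outside the count-based window but inside the hour-based one. With a commit count, the effective retention time depends on how often the writer commits; with hours, it can be set directly against the longest expected read.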

