Posted to commits@hudi.apache.org by "alexone95 (via GitHub)" <gi...@apache.org> on 2023/04/21 13:02:22 UTC

[GitHub] [hudi] alexone95 opened a new issue, #8535: [SUPPORT] manually deleting file under .hoodie/archived

alexone95 opened a new issue, #8535:
URL: https://github.com/apache/hudi/issues/8535

   Hi, could there be any problem if I delete the files under .hoodie/archived while a CDC process is running 24/7 on an EMR cluster? Or do I have to do it when no processes are running?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] alexone95 commented on issue #8535: [SUPPORT] manually deleting file under .hoodie/archived

Posted by "alexone95 (via GitHub)" <gi...@apache.org>.
alexone95 commented on issue #8535:
URL: https://github.com/apache/hudi/issues/8535#issuecomment-1519519787

   Thanks for the answer. Just one more thing: will there be a problem deleting these files while a PySpark process is running?




[GitHub] [hudi] nsivabalan commented on issue #8535: [SUPPORT] manually deleting file under .hoodie/archived

Posted by "nsivabalan (via GitHub)" <gi...@apache.org>.
nsivabalan commented on issue #8535:
URL: https://github.com/apache/hudi/issues/8535#issuecomment-1520655294

   We fixed an issue with reading older archived files during Hive sync. We should not be reading older archived files at any point.
   




[GitHub] [hudi] alexone95 commented on issue #8535: [SUPPORT] manually deleting file under .hoodie/archived

Posted by "alexone95 (via GitHub)" <gi...@apache.org>.
alexone95 commented on issue #8535:
URL: https://github.com/apache/hudi/issues/8535#issuecomment-1520189514

   @ad1happy2go The problem is that I know Hudi periodically reads all files under the .hoodie/archived directory by issuing GET requests to the S3 bucket. Wouldn't deleting them cause a problem there?




[GitHub] [hudi] alexone95 commented on issue #8535: [SUPPORT] manually deleting file under .hoodie/archived

Posted by "alexone95 (via GitHub)" <gi...@apache.org>.
alexone95 commented on issue #8535:
URL: https://github.com/apache/hudi/issues/8535#issuecomment-1522925284

   @nsivabalan The only problem is that we are using the Hudi 12.0.1 version, where the issue is not fixed, so we had to use this workaround to avoid it.




[GitHub] [hudi] nsivabalan commented on issue #8535: [SUPPORT] manually deleting file under .hoodie/archived

Posted by "nsivabalan (via GitHub)" <gi...@apache.org>.
nsivabalan commented on issue #8535:
URL: https://github.com/apache/hudi/issues/8535#issuecomment-1518366401

   Files in archived are generally for bookkeeping purposes, in case you wish to go back in time and investigate what happened. But if you really don't need them, you can delete them.
   You can sort by last modification time and delete all files in the archived folder except, say, the latest 50 to 100. I assume you have accrued 1000 or more files and wish to delete them.
   We do want to add automatic support for deleting them when they are not required, but have not had a chance to add it yet.
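
The sort-and-keep-latest approach described above could be sketched roughly as follows. This is only an illustration, not something Hudi ships: the function names, the `keep_latest` and `dry_run` parameters, and the assumption that archived files live under `<table_path>/.hoodie/archived/` in an S3 bucket are all hypothetical choices made for the example. The S3 access uses boto3's `list_objects_v2` paginator and `delete_objects`.

```python
from datetime import datetime
from typing import Iterable, List, Tuple


def select_archived_files_to_delete(
    files: Iterable[Tuple[str, datetime]], keep_latest: int = 100
) -> List[str]:
    """Return the keys of every file except the `keep_latest` newest ones.

    `files` is an iterable of (key, last_modified) pairs.
    """
    if keep_latest < 0:
        raise ValueError("keep_latest must be non-negative")
    # Sort newest first, so the first `keep_latest` entries are the ones to keep.
    newest_first = sorted(files, key=lambda f: f[1], reverse=True)
    return [key for key, _ in newest_first[keep_latest:]]


def prune_archived_folder(
    bucket: str, table_path: str, keep_latest: int = 100, dry_run: bool = True
) -> List[str]:
    """List <table_path>/.hoodie/archived/ on S3 and delete all but the newest files.

    Hypothetical helper: the bucket/prefix layout is an assumption.
    With dry_run=True (the default) nothing is deleted; the candidate
    keys are only returned for inspection.
    """
    import boto3  # imported lazily so the selection logic works without boto3

    s3 = boto3.client("s3")
    prefix = table_path.rstrip("/") + "/.hoodie/archived/"

    files = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            files.append((obj["Key"], obj["LastModified"]))

    to_delete = select_archived_files_to_delete(files, keep_latest)
    if not dry_run:
        # delete_objects accepts at most 1000 keys per request.
        for start in range(0, len(to_delete), 1000):
            batch = to_delete[start:start + 1000]
            s3.delete_objects(
                Bucket=bucket,
                Delete={"Objects": [{"Key": key} for key in batch]},
            )
    return to_delete
```

Running it first with `dry_run=True` and reviewing the returned keys is probably the safer way to start, especially on a table that is being written to continuously.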




[GitHub] [hudi] nsivabalan closed issue #8535: [SUPPORT] manually deleting file under .hoodie/archived

Posted by "nsivabalan (via GitHub)" <gi...@apache.org>.
nsivabalan closed issue #8535: [SUPPORT] manually deleting file under .hoodie/archived
URL: https://github.com/apache/hudi/issues/8535




[GitHub] [hudi] ad1happy2go commented on issue #8535: [SUPPORT] manually deleting file under .hoodie/archived

Posted by "ad1happy2go (via GitHub)" <gi...@apache.org>.
ad1happy2go commented on issue #8535:
URL: https://github.com/apache/hudi/issues/8535#issuecomment-1520167351

   @alexone95 If the code is only using the active timeline, then deleting the files while the process is running should not create any issues.

