Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/10/09 08:03:35 UTC

[GitHub] [hudi] bvaradar edited a comment on issue #2151: [SUPPORT] How to run Periodic Compaction? Multiple Tables - When no Upserts

bvaradar edited a comment on issue #2151:
URL: https://github.com/apache/hudi/issues/2151#issuecomment-706029681


   @tandonraghav : It should work as-is with 0.6.0. You should be able to run spark.write() with inline compaction turned off. Based on the compaction schedule, these writes will still schedule compactions. You can then use your writeClient code to run async compactions.
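   A minimal sketch of what that spark.write() configuration might look like. This is an illustration, not from the comment itself: the table name, base path, and operation are placeholders, and it assumes a MERGE_ON_READ table (the table type that uses compaction).

```python
# Hedged sketch: Hudi write options for a MERGE_ON_READ table with inline
# compaction disabled, so each write only *schedules* compactions, which a
# separate async process (e.g. your writeClient code) can execute later.
# Table name and operation are placeholder values.
hudi_options = {
    "hoodie.table.name": "my_table",                       # placeholder
    "hoodie.datasource.write.table.type": "MERGE_ON_READ",
    "hoodie.datasource.write.operation": "upsert",         # placeholder
    "hoodie.compact.inline": "false",                      # do not compact inline
}

# With a live SparkSession, this would be used roughly as:
# df.write.format("hudi").options(**hudi_options).mode("append").save("/path/to/my_table")
```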
   
   Just so you are aware of all the options: note that inline compaction does not need to run every single time you ingest data. You can set it to run every N commits, but when it does run it will be inline (it blocks writing).
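   The every-N-commits behavior is driven by a delta-commits threshold. A hedged sketch; N=5 is an illustrative value, not one suggested in the comment:

```python
# Hedged sketch: keep inline compaction enabled, but only trigger it once
# every N delta commits (N=5 here is purely illustrative). Writes that do
# trigger compaction will block until it completes.
inline_compaction_options = {
    "hoodie.compact.inline": "true",
    "hoodie.compact.inline.max.delta.commits": "5",
}
```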
   
   We usually have folks running async compaction in delta-streamer continuous mode and, more recently, in structured streaming. Async compaction with a Spark DataFrame write or with deltastreamer in run-once mode is generally not done, since users would need to set up a separate compaction job. Let me open a JIRA to support running compaction alone via spark.write() to make this easier...
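   For reference, a sketch of the kind of spark-submit invocation used for delta-streamer continuous mode, where async compaction runs inside the same job. The jar path, base path, and table name are placeholders; only the class name and flags are real HoodieDeltaStreamer options:

```python
# Hedged sketch: a spark-submit command line for HoodieDeltaStreamer in
# continuous mode (built as a Python list for illustration). In this mode
# ingestion keeps running and compaction is executed asynchronously
# within the same job. Paths and table name are placeholders.
delta_streamer_cmd = [
    "spark-submit",
    "--class", "org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer",
    "hudi-utilities-bundle.jar",                  # placeholder jar path
    "--table-type", "MERGE_ON_READ",
    "--target-base-path", "/path/to/my_table",    # placeholder
    "--target-table", "my_table",                 # placeholder
    "--continuous",                               # keep ingesting; run compaction async
]
```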
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org