Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/09/13 22:00:02 UTC

[GitHub] [hudi] nsivabalan commented on issue #3077: [SUPPORT] Large latencies in hudi writes using upsert mode.

nsivabalan commented on issue #3077:
URL: https://github.com/apache/hudi/issues/3077#issuecomment-1245994450

   Sorry we dropped the ball on this. Let's try to make some progress.
   I re-read the entire thread.
   
   Here are my thoughts:
   - What kind of write are you using to write to Hudi? Is it a Spark datasource write, DeltaStreamer, or Spark Structured Streaming?
   - MOR is probably not the right approach, since you don't seem to have many updates (only 1%).
   - With COW, we can disable small file handling and instead enable clustering to batch up any small files at a regular cadence.
   - Also, can we try increasing the file size to maybe 250 MB and see how it compares with what you already see?
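
To make the COW suggestions concrete, here is a rough sketch of the write options they might translate to, assuming a Spark datasource write from PySpark. The helper name `build_hudi_cow_options`, the table name, and the clustering cadence of 4 commits are hypothetical placeholders, not values from this thread; please check the config keys and defaults against the Hudi version in use.

```python
def build_hudi_cow_options(table_name: str, max_file_mb: int = 250) -> dict:
    """Build illustrative Hudi write options for a COW table.

    Hypothetical helper: small file handling is turned off on the write
    path and inline clustering is left to stitch small files together.
    """
    max_file_bytes = max_file_mb * 1024 * 1024
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.table.type": "COPY_ON_WRITE",
        "hoodie.datasource.write.operation": "upsert",
        # A small file limit of 0 disables small file handling during writes.
        "hoodie.parquet.small.file.limit": "0",
        # Target max base file size (about 250 MB by default in this sketch).
        "hoodie.parquet.max.file.size": str(max_file_bytes),
        # Run clustering inline; the cadence of 4 commits is an assumption.
        "hoodie.clustering.inline": "true",
        "hoodie.clustering.inline.max.commits": "4",
        # Assumption: cluster into files of the same target size.
        "hoodie.clustering.plan.strategy.target.file.max.bytes": str(max_file_bytes),
    }


if __name__ == "__main__":
    for key, value in build_hudi_cow_options("my_table").items():
        print(f"{key}={value}")
```

With an existing DataFrame `df`, these could be passed along as `df.write.format("hudi").options(**build_hudi_cow_options("my_table")).mode("append").save(base_path)`. The record key, precombine, and partition path options are left out here and would still need to be set for a real table.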
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org