Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/09/27 14:59:02 UTC

[GitHub] [hudi] ranjani1993 commented on issue #6775: [Support] Is HUDI suitable for a usecase with no incremental data from source?

ranjani1993 commented on issue #6775:
URL: https://github.com/apache/hudi/issues/6775#issuecomment-1259635055

   @yihua 
   Our record key is not ordered. We used the "SIMPLE" index instead of "BLOOM", because SIMPLE gave better performance than BLOOM in our use case.
   
   **run time statistics:**
   SIMPLE index - 45 mins to update a single partition
   BLOOM index - more than 2 hours to update a single partition
   Hudi bulk insert - 15 mins to load a single partition
   Regular insert overwrite - 15 mins to load a single partition
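
   For reference, a minimal sketch of how the Hudi runs compared above might be expressed as Hudi write options. The keys `hoodie.index.type` and `hoodie.datasource.write.operation` are standard Hudi configs; the helper name, table name, and field names are hypothetical placeholders and are not taken from the original job. The regular insert overwrite run is assumed to be plain Spark and is not modeled here.

```python
# Sketch only: builds option dictionaries for the Hudi runs timed above.
# hudi_write_options, "example_table", "record_id", "partition_col" and
# "updated_at" are hypothetical names chosen for illustration.

def hudi_write_options(operation, index_type=None,
                       table_name="example_table",
                       record_key="record_id",
                       partition_field="partition_col",
                       precombine_field="updated_at"):
    """Return a dict of Hudi write options for one operation/index combination."""
    options = {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.operation": operation,
    }
    # The index type matters for upsert, which must look up existing records;
    # it is left unset here for bulk_insert.
    if index_type is not None:
        options["hoodie.index.type"] = index_type
    return options


upsert_simple = hudi_write_options("upsert", "SIMPLE")
upsert_bloom = hudi_write_options("upsert", "BLOOM")
bulk_insert = hudi_write_options("bulk_insert")
```

   With a Spark DataFrame `df` and a target path `base_path` (both assumed), these would typically be passed as `df.write.format("hudi").options(**upsert_simple).mode("append").save(base_path)`.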
   
   Can the Hudi "upsert" operation provide better performance than Hudi bulk insert or a regular insert overwrite in this scenario?
   
   

