Posted to commits@hudi.apache.org by "ad1happy2go (via GitHub)" <gi...@apache.org> on 2023/04/06 08:45:30 UTC

[GitHub] [hudi] ad1happy2go commented on issue #8391: [SUPPORT]hudi-0.13 Using spark to write into Hudi is too slow

ad1happy2go commented on issue #8391:
URL: https://github.com/apache/hudi/issues/8391#issuecomment-1498702556

   @Lujun-WC It's definitely a lot slower than expected; processing just 24 MB of data should not take that much time. Also noticed that the task reading the 23 MB partition takes 1.1 min while the one reading only 48 KB takes 2.7 min, which is quite unexpected.
   - Can you share the entire code, and how big is the existing data set?
   - Can you check how many unique values there are for cdt and data_source? (See the sketch below for one way to do this.)
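
   For reference, here is a minimal sketch for checking those cardinalities. `inputDf` is just a placeholder for whatever DataFrame you are writing to Hudi, so adjust the name to match your job:

       // Count distinct values of the partition/key columns to spot skew.
       // `inputDf` is a placeholder for the DataFrame being written to Hudi.
       import org.apache.spark.sql.functions.countDistinct

       val cardinalities = inputDf.agg(
         countDistinct("cdt").alias("distinct_cdt"),
         countDistinct("data_source").alias("distinct_data_source")
       )
       cardinalities.show()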


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org