Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/07/09 03:13:13 UTC

[GitHub] [hudi] vinothchandar commented on issue #1786: [SUPPORT] Bulk insert slow on MOR

vinothchandar commented on issue #1786:
URL: https://github.com/apache/hudi/issues/1786#issuecomment-655871974


   @rvd8345 Part of the issue here is the sort that bulk_insert performs to seed the dataset so that files are ordered by key. This ordering helps upsert performance later.
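
To illustrate why key-ordered files can help later upserts, here is a minimal sketch in plain Python. It is not Hudi code: the helper names `file_key_ranges` and `candidate_files`, the key format, and the file size are all hypothetical, and it assumes that an upsert can skip any file whose min/max key range excludes the incoming key.

```python
import random


def file_key_ranges(keys, records_per_file):
    """Split keys into consecutive files and return each file's (min, max) key."""
    files = [keys[i:i + records_per_file]
             for i in range(0, len(keys), records_per_file)]
    return [(min(f), max(f)) for f in files]


def candidate_files(ranges, key):
    """Count files whose key range could contain `key`."""
    return sum(1 for lo, hi in ranges if lo <= key <= hi)


random.seed(0)
keys = [f"key-{i:06d}" for i in range(10_000)]  # hypothetical record keys
shuffled = keys[:]
random.shuffle(shuffled)

# Sorted seeding: each file covers a narrow, non-overlapping key range.
sorted_ranges = file_key_ranges(sorted(shuffled), 1_000)
# Unsorted seeding: each file's range spans almost the whole key space.
unsorted_ranges = file_key_ranges(shuffled, 1_000)

probe = "key-004242"
sorted_hits = candidate_files(sorted_ranges, probe)
unsorted_hits = candidate_files(unsorted_ranges, probe)
print(f"files to check: sorted={sorted_hits}, unsorted={unsorted_hits}")
```

With the sorted layout exactly one file can contain the probed key, while in the unsorted layout nearly every file remains a candidate. That is the trade-off described above: bulk_insert pays for the sort up front so that later key lookups touch fewer files.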
   
   We are working on making these sort modes configurable in the upcoming release. cc @yihua
   
   At a high level, can you please describe what your workload looks like? We can then advise you accordingly. Writing files to HDFS and using bulk_insert to prep a Hudi dataset are a bit different, so the two are not directly comparable.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org