Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/08/20 02:17:50 UTC

[GitHub] [hudi] umehrot2 commented on issue #1981: [SUPPORT] Huge performance Difference Between Hudi and Regular Parquet in Athena

umehrot2 commented on issue #1981:
URL: https://github.com/apache/hudi/issues/1981#issuecomment-676855516


   @vinothchandar @rubenssoto I think this could simply be the difference between Presto's performance on regular Parquet, where it uses its native Parquet readers end to end, and Presto's performance on Hudi, where it needs to at least use the split/listing logic from Hoodie's Input Format. Could you try the queries on an EMR cluster and observe the difference in performance through Presto?
   
   cc @bhasudha as well
   
   @rubenssoto have you tried opening a ticket with AWS support about this? They should at least be able to help rule out whether it's something specific to Athena or just a performance bottleneck in Hudi.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org