Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/10/01 10:09:37 UTC

[GitHub] [hudi] kazdy commented on issue #6832: [SUPPORT] AWS Glue 3.0 fail to write dataset with hudi (hive sync issue)

kazdy commented on issue #6832:
URL: https://github.com/apache/hudi/issues/6832#issuecomment-1264315897

   hi @dragonH 
   
   I had the same issue. Basically, the Glue Data Catalog converts table names and column names to lowercase. The Glue client for Hive Metastore (the AWS one) does not take this into account: Customer_Sample_Hudi becomes customer_sample_hudi, and because the client compares names case-sensitively, it throws this error:
   `java.lang.IllegalArgumentException: Partitions must be in the same table`
   
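   Here's the piece of code that causes this issue (the line quoted is the DB-name check; the error above presumably comes from the equivalent table-name comparison):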
   
   `checkArgument(tbl.getDbName().equals(partition.getDbName()), "Partitions must be in the same DB");`
   
   https://github.com/awslabs/aws-glue-data-catalog-client-for-apache-hive-metastore/blob/ac318d42b3df00c1ada1be6a3305bcf9bd4895f0/aws-glue-datacatalog-client-common/src/main/java/com/amazonaws/glue/catalog/metastore/GlueMetastoreClientDelegate.java#L679
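
To make the mismatch concrete, here is a minimal Python sketch of a case-sensitive check of this kind. It is not the Glue client's code: `check_same_table` is a hypothetical stand-in, and which side carries the lowercased name is an assumption based on the description above.

```python
def check_same_table(table_name: str, partition_table_name: str) -> None:
    """Hypothetical stand-in for a case-sensitive precondition check."""
    if table_name != partition_table_name:
        raise ValueError("Partitions must be in the same table")


# Assumption: the catalog hands back the lowercased name, while the
# partition still carries the name exactly as it was configured.
configured_name = "Customer_Sample_Hudi"
catalog_name = configured_name.lower()

try:
    check_same_table(catalog_name, configured_name)
except ValueError as err:
    print(f"check failed: {err}")

# With an all-lowercase configured name the two sides agree.
check_same_table(catalog_name, configured_name.lower())
```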
   
   You can try using AWSGlueCatalogSyncClient; it might be fixed there.
   If you don't want to change the sync tool, just change this:
   `'hoodie.datasource.hive_sync.table': 'Customer_Sample_Hudi'`
   to
   `'hoodie.datasource.hive_sync.table': 'customer_sample_hudi'`
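
If the table name comes from a variable rather than a literal, the same workaround can be applied in code. The following is only a sketch: `lowercase_hive_sync_table` is a hypothetical helper, and the options dict is assumed to hold the writer options as plain strings.

```python
HIVE_SYNC_TABLE_KEY = "hoodie.datasource.hive_sync.table"


def lowercase_hive_sync_table(options: dict) -> dict:
    """Return a copy of the options with the hive sync table name lowercased.

    Hypothetical helper; every other option is left untouched.
    """
    normalized = dict(options)
    if HIVE_SYNC_TABLE_KEY in normalized:
        normalized[HIVE_SYNC_TABLE_KEY] = normalized[HIVE_SYNC_TABLE_KEY].lower()
    return normalized


hudi_options = {"hoodie.datasource.hive_sync.table": "Customer_Sample_Hudi"}
hudi_options = lowercase_hive_sync_table(hudi_options)
print(hudi_options["hoodie.datasource.hive_sync.table"])
```

The helper returns a new dict instead of mutating its argument, so the original options stay available for comparison.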
   

