Posted to commits@hudi.apache.org by "sivabalan narayanan (Jira)" <ji...@apache.org> on 2021/01/26 20:27:00 UTC

[jira] [Resolved] (HUDI-306) Get to Hudi to support AWS Glue Catalog and other Hive Metastore implementations

     [ https://issues.apache.org/jira/browse/HUDI-306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan resolved HUDI-306.
--------------------------------------
    Fix Version/s: 0.5.2
       Resolution: Fixed

> Get to Hudi to support AWS Glue Catalog and other Hive Metastore implementations
> --------------------------------------------------------------------------------
>
>                 Key: HUDI-306
>                 URL: https://issues.apache.org/jira/browse/HUDI-306
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Hive Integration
>            Reporter: Udit Mehrotra
>            Assignee: Udit Mehrotra
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.5.2
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hudi currently does not work with the AWS Glue Catalog. The exception it runs into has also been reported in this [issue|https://github.com/apache/incubator-hudi/issues/954].
> As mentioned in the issue, the reason for this is:
>  * Hudi currently interacts with Hive in two different ways:
>  ** The create-table statement is submitted directly to Hive via JDBC [https://github.com/apache/incubator-hudi/blob/master/hudi-hive/src/main/java/org/apache/hudi/hive/HoodieHiveClient.java#L472] . Hive therefore creates the right metastore client internally (i.e. Glue, if {{*hive.metastore.client.factory.class*}} is set to {{*com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory*}} in hive-site).
>  ** Partition listing, among other things, is done by calling the Hive metastore APIs directly through a Hive metastore client: [https://github.com/apache/incubator-hudi/blob/master/hudi-hive/src/main/java/org/apache/hudi/hive/HoodieHiveClient.java#L240]
>  * In the Hudi code, the standard implementation of the metastore client (not the Glue metastore client) is instantiated: [https://github.com/apache/incubator-hudi/blob/master/hudi-hive/src/main/java/org/apache/hudi/hive/HoodieHiveClient.java#L109] .
>  * Ideally, instantiation of the metastore client should be left to Hive through [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L5045] so that it considers other implementations of the metastore client that might be configured through {{*hive.metastore.client.factory.class*}} .
> That is why the table gets created in the Glue metastore, but when reading or scanning partitions Hudi talks to the local Hive metastore, where it does not find the table.
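
The mismatch described above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions: it uses no Hudi, Hive, or Glue classes, and every type and method name in it (MetastoreClientDemo, ToyMetastoreClient, InMemoryCatalog, clientFromConfig, hardCodedClient) is hypothetical. Only the property name and the Glue factory class name are taken from the description. The table is created through a config-driven lookup, while partitions are read through a hard-coded client that ignores the configuration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Toy model only: none of these types are Hudi, Hive, or Glue classes.
class MetastoreClientDemo {

    /** Hypothetical stand-in for a metastore client. */
    interface ToyMetastoreClient {
        void createTable(String table, List<String> partitions);
        Optional<List<String>> listPartitions(String table);
    }

    /** In-memory catalog standing in for one metastore backend. */
    static final class InMemoryCatalog implements ToyMetastoreClient {
        private final Map<String, List<String>> tables = new HashMap<>();

        @Override
        public void createTable(String table, List<String> partitions) {
            tables.put(table, new ArrayList<>(partitions));
        }

        @Override
        public Optional<List<String>> listPartitions(String table) {
            return Optional.ofNullable(tables.get(table));
        }
    }

    // Property name and factory class name as quoted in the issue description.
    static final String FACTORY_KEY = "hive.metastore.client.factory.class";
    static final String GLUE_FACTORY =
            "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory";

    private final ToyMetastoreClient local = new InMemoryCatalog();
    private final ToyMetastoreClient glue = new InMemoryCatalog();

    /** Config-driven lookup: honours the factory setting (the table-creation path). */
    ToyMetastoreClient clientFromConfig(Map<String, String> conf) {
        return GLUE_FACTORY.equals(conf.get(FACTORY_KEY)) ? glue : local;
    }

    /** Hard-coded lookup: ignores the configuration (the partition-listing path). */
    ToyMetastoreClient hardCodedClient() {
        return local;
    }

    public static void main(String[] args) {
        MetastoreClientDemo demo = new MetastoreClientDemo();
        Map<String, String> conf = Collections.singletonMap(FACTORY_KEY, GLUE_FACTORY);

        // The table lands in the Glue stand-in because the factory setting is honoured.
        demo.clientFromConfig(conf).createTable("trips", Collections.singletonList("2021/01/26"));

        // The hard-coded client looks in the local stand-in and finds nothing.
        System.out.println("hard-coded client finds table: "
                + demo.hardCodedClient().listPartitions("trips").isPresent());
        // The config-driven client looks in the same catalog the table was created in.
        System.out.println("config-driven client finds table: "
                + demo.clientFromConfig(conf).listPartitions("trips").isPresent());
    }
}
```

In this model the first lookup reports that the table is missing and the second finds it, which mirrors the proposed direction of letting one config-aware path supply the client for both operations.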



--
This message was sent by Atlassian Jira
(v8.3.4#803005)