Posted to issues@flink.apache.org by "Bowen Li (JIRA)" <ji...@apache.org> on 2019/06/14 18:16:00 UTC

[jira] [Comment Edited] (FLINK-12771) Support ConnectorCatalogTable in HiveCatalog

    [ https://issues.apache.org/jira/browse/FLINK-12771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16864321#comment-16864321 ] 

Bowen Li edited comment on FLINK-12771 at 6/14/19 6:15 PM:
-----------------------------------------------------------

Hi [~dawidwys], here is the [spec for Hive temp tables|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-TemporaryTables]. Basically, temporary objects are held in the Hive metastore client (client side) rather than in the Hive metastore service (server side).

Re: 2. IMHO, HiveCatalog offers the capability of persisting metadata, but we shouldn't limit ourselves to that, considering the use cases and experience we want to support.

Re: 3/4. We should put good, clear documentation in place to educate users that only the new {{CatalogTable}} can be persisted, and add log messages for applications and a noticeable reminder for interactive consoles.

Note that this only affects Table API users, not SQL CLI users. Table API users are typically more experienced and advanced Flink users, and it won't be hard for them to learn that inline tables are not persistent in 1.8 or older versions and will remain so in 1.9 and beyond.
 


was (Author: phoenixjiangnan):
Hi [~dawidwys], here is the [spec for Hive temp tables|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-TemporaryTables]. Basically, temporary objects are held in the Hive metastore client (client side) rather than in the Hive metastore service (server side).

Re: 2. IMHO, HiveCatalog offers the capability of persisting metadata, but we shouldn't limit ourselves to that, considering the use cases and experience we want to support.

Re: 3/4. We should put good, clear documentation in place to educate users that only the new {{CatalogTable}} can be persisted, and add log messages for applications and a noticeable reminder for interactive consoles.

Note that this only affects Table API users, not SQL CLI users. Table API users are often more experienced in Flink, and it won't be hard for them to learn that inline tables are not persistent in 1.8 or older versions and will remain so in 1.9 and beyond.
 

> Support ConnectorCatalogTable in HiveCatalog
> --------------------------------------------
>
>                 Key: FLINK-12771
>                 URL: https://issues.apache.org/jira/browse/FLINK-12771
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Connectors / Hive
>            Reporter: Bowen Li
>            Assignee: Bowen Li
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.9.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently {{HiveCatalog}} does not support {{ConnectorCatalogTable}}. This is a major drawback in real use cases: when Table API users set a {{HiveCatalog}} as their default catalog (which is very likely), they can no longer create or use any inline table sources/sinks with their default catalog. This makes it really inconvenient for Table API users to use Flink for exploration, experimentation, and production.
> There are several workarounds. E.g. users can switch their default catalog, but that defeats our original intention of having a default {{HiveCatalog}}; or users can register their inline sources/sinks in Flink's default catalog, which is an in-memory catalog, but that not only requires users to type the full path of a table but also requires them to be aware of Flink's default catalog, default db, and their names. In short, none of the workarounds seems reasonable or user friendly.
> From another perspective, Hive has the concept of temporary tables, which are stored in the memory of the Hive metastore client and are removed when the client is shut down. In Flink, {{ConnectorCatalogTable}} can be seen as a type of session-based temporary table, and {{HiveCatalog}} (and potentially any catalog implementation) can store it in memory. By introducing the concept of temp tables, we could greatly reduce friction for users and improve their experience and productivity.
> Thus, we propose adding a simple in-memory map for {{ConnectorCatalogTable}} in {{HiveCatalog}} to allow users to create and use inline sources/sinks when their default catalog is a {{HiveCatalog}}.
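
To make the proposal in the description concrete, here is a rough sketch of the idea: a catalog that keeps inline tables in a plain in-memory map next to its persistent store. This is not Flink's actual {{HiveCatalog}} code. The class name, the method names, and the use of strings as table definitions are all hypothetical, and the "persistent" map merely stands in for the Hive metastore.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch only; none of these names come from Flink.
final class TempTableCatalogSketch {
    // Stand-in for the Hive metastore: entries here would outlive the session.
    private final Map<String, String> persistentTables = new HashMap<>();
    // Session-scoped inline tables, held in client memory only.
    private final Map<String, String> inlineTables = new HashMap<>();

    // Registers a table; inline tables go to the in-memory map and are never persisted.
    void createTable(String path, String definition, boolean inline) {
        if (persistentTables.containsKey(path) || inlineTables.containsKey(path)) {
            throw new IllegalStateException("Table already exists: " + path);
        }
        (inline ? inlineTables : persistentTables).put(path, definition);
    }

    // Looks up a table by path, checking inline tables first, then the persistent store.
    Optional<String> getTable(String path) {
        String inline = inlineTables.get(path);
        return Optional.ofNullable(inline != null ? inline : persistentTables.get(path));
    }

    // True only for tables that would survive the end of the session.
    boolean isPersistent(String path) {
        return persistentTables.containsKey(path);
    }

    // Ends the session: inline tables are dropped, persistent ones remain.
    void close() {
        inlineTables.clear();
    }
}
```

With something along these lines, a user whose default catalog is the Hive-backed one could register an inline source under a short path and look it up like any other table, while only the persistent entries would remain after the session closes.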



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)