Posted to issues@flink.apache.org by "Xuefu Zhang (JIRA)" <ji...@apache.org> on 2018/10/31 22:54:00 UTC

[jira] [Commented] (FLINK-10729) Create a Hive connector for Hive data access in Flink

    [ https://issues.apache.org/jira/browse/FLINK-10729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670867#comment-16670867 ] 

Xuefu Zhang commented on FLINK-10729:
-------------------------------------

Hi [~ZhenqiuHuang], Sure. Thanks for your interest. Please note that this only outlines the direction we are heading in. We may need a design doc for this.

I was thinking of implementing a connector in Flink that utilizes Hive's generic InputFormats and OutputFormats. This would cover all data formats in Hive at once, rather than handling the formats one at a time (orc, parquet, etc.). However, my thinking is still preliminary and I need to take a closer look.
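To illustrate the idea above, here is a minimal, self-contained sketch of a single source that delegates to a pluggable format interface. It does not use any real Hive, Hadoop, or Flink classes: `RowInputFormat`, `CommaFormat`, `PipeFormat`, and `GenericSource` are hypothetical names invented for this example, and the two toy text formats merely stand in for real formats such as orc or parquet.

```java
import java.util.ArrayList;
import java.util.List;

public class GenericFormatSketch {

    /** Hypothetical stand-in for a generic input format: turns raw storage into rows. */
    interface RowInputFormat {
        List<String[]> read(String raw);
    }

    /** Hypothetical format 1: comma-separated fields, one row per line. */
    static class CommaFormat implements RowInputFormat {
        @Override
        public List<String[]> read(String raw) {
            List<String[]> rows = new ArrayList<>();
            for (String line : raw.split("\n")) {
                rows.add(line.split(","));
            }
            return rows;
        }
    }

    /** Hypothetical format 2: pipe-separated fields, one row per line. */
    static class PipeFormat implements RowInputFormat {
        @Override
        public List<String[]> read(String raw) {
            List<String[]> rows = new ArrayList<>();
            for (String line : raw.split("\n")) {
                rows.add(line.split("\\|"));
            }
            return rows;
        }
    }

    /** The single connector: it only knows the interface, not any concrete format. */
    static class GenericSource {
        private final RowInputFormat format;

        GenericSource(RowInputFormat format) {
            this.format = format;
        }

        /** Reads all rows through the plugged-in format and renders them tab-separated. */
        List<String> fetch(String raw) {
            List<String> out = new ArrayList<>();
            for (String[] row : format.read(raw)) {
                out.add(String.join("\t", row));
            }
            return out;
        }
    }

    public static void main(String[] args) {
        // The same connector code serves both formats; only the plugged-in format differs.
        GenericSource commaSource = new GenericSource(new CommaFormat());
        GenericSource pipeSource = new GenericSource(new PipeFormat());
        System.out.println(commaSource.fetch("1,x\n2,y"));
        System.out.println(pipeSource.fetch("1|x\n2|y"));
    }
}
```

In this sketch, supporting a new format means adding one more `RowInputFormat` implementation, while `GenericSource` stays unchanged. That is the property the generic approach is after; whether Hive's actual interfaces fit this shape cleanly is exactly what needs a closer look.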

Please also share your thoughts on this. Thanks.

> Create a Hive connector for Hive data access in Flink
> -----------------------------------------------------
>
>                 Key: FLINK-10729
>                 URL: https://issues.apache.org/jira/browse/FLINK-10729
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API & SQL
>    Affects Versions: 1.6.2
>            Reporter: Xuefu Zhang
>            Assignee: Zhenqiu Huang
>            Priority: Major
>
> As part of the Flink-Hive integration effort, it's important for Flink to access (read/write) Hive data, which is the responsibility of the Hive connector. While there is an HCatalog data connector in the code base, it's not complete (i.e. it is missing all connector-related classes such as validators, etc.). Further, the HCatalog interface has many limitations, such as accessing only a subset of Hive data and supporting only a subset of Hive data types. In addition, it's not actively maintained. In fact, it's now only a sub-project in Hive.
> Therefore, we propose here a complete connector set for Hive tables, not via HCatalog, but directly via the Hive interface. The HCatalog connector will be deprecated.
> Please note that the connector for Hive metadata is already covered in other JIRAs, as {{HiveExternalCatalog}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)