Posted to issues@flink.apache.org by "winifredtang (JIRA)" <ji...@apache.org> on 2018/10/25 01:46:00 UTC

[jira] [Assigned] (FLINK-5802) Flink SQL calling Hive User-Defined Functions

     [ https://issues.apache.org/jira/browse/FLINK-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

winifredtang reassigned FLINK-5802:
-----------------------------------

    Assignee: winifredtang

> Flink SQL calling Hive User-Defined Functions
> ---------------------------------------------
>
>                 Key: FLINK-5802
>                 URL: https://issues.apache.org/jira/browse/FLINK-5802
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>            Reporter: Zhuoluo Yang
>            Assignee: winifredtang
>            Priority: Major
>              Labels: features
>
> It's important to be able to call Hive UDFs from Flink SQL. A great many UDFs have been written in Hive over the last ten years.
> Reusing these Hive UDFs matters: this feature will reduce the cost of migration and bring more users to Flink.
> Spark SQL already supports this feature.
> https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.0/bk_spark-guide/content/calling-udfs.html
> The Hive UDFs here include both built-in UDFs and customized UDFs. Since a lot of business logic has been written as customized UDFs, those are more important than the built-in ones.
> Generally, there are three kinds of UDFs in Hive: UDF (scalar), UDTF (table-generating) and UDAF (aggregate).
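> For illustration, here is a minimal sketch of a customized scalar UDF, the simplest of the three kinds; the class name ToUpper is hypothetical, and it assumes hive-exec is on the classpath:
> {code:java}
> import org.apache.hadoop.hive.ql.exec.UDF;
> import org.apache.hadoop.io.Text;
>
> // A customized Hive UDF of the simplest kind: one row in, one scalar value out.
> public class ToUpper extends UDF {
>     // Hive resolves evaluate(...) by reflection and maps Hive types to writables.
>     public Text evaluate(Text input) {
>         return input == null ? null : new Text(input.toString().toUpperCase());
>     }
> }
> {code}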
> Here is the Spark SQL documentation on Hive compatibility: http://spark.apache.org/docs/latest/sql-programming-guide.html#compatibility-with-apache-hive 
> Spark code:
> https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala
> https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala
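> For concreteness, a sketch of what calling a Hive UDF from Spark SQL looks like today, which is roughly the experience Flink SQL should match; the function, class, and table names are hypothetical, and enableHiveSupport() assumes Hive dependencies and a metastore are available:
> {code:java}
> import org.apache.spark.sql.SparkSession;
>
> public class HiveUdfFromSparkSql {
>     public static void main(String[] args) {
>         SparkSession spark = SparkSession.builder()
>                 .appName("hive-udf-demo")
>                 .enableHiveSupport() // requires Hive classes and a metastore
>                 .getOrCreate();
>
>         // Register the customized UDF from the sketch above, then call it in SQL.
>         spark.sql("CREATE TEMPORARY FUNCTION to_upper AS 'ToUpper'");
>         spark.sql("SELECT to_upper(name) FROM people").show();
>
>         spark.stop();
>     }
> }
> {code}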



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)