Posted to issues@spark.apache.org by "Dongjoon Hyun (JIRA)" <ji...@apache.org> on 2017/03/06 18:40:32 UTC

[jira] [Comment Edited] (SPARK-18832) Spark SQL: Thriftserver unable to run a registered Hive UDTF

    [ https://issues.apache.org/jira/browse/SPARK-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15897799#comment-15897799 ] 

Dongjoon Hyun edited comment on SPARK-18832 at 3/6/17 6:39 PM:
---------------------------------------------------------------

Hi, [~roadster11x].

Thank you for the sample file. I tried the following with your sample code on Apache Spark 2.0.0. (I removed the package name line from the code just for simplicity.)

*HIVE*
{code}
$ hive
Logging initialized using configuration in jar:file:/usr/local/Cellar/hive12/1.2.1/libexec/lib/hive-common-1.2.1.jar!/hive-log4j.properties
hive> select hello('a');
Added [/Users/dhyun/UDF/a.jar] to class path
Added resources: [/Users/dhyun/UDF/a.jar]
OK
***a***	###a###
Time taken: 1.347 seconds, Fetched: 1 row(s)
{code}

*SPARK THRIFTSERVER*
{code}
$ SPARK_HOME=$PWD sbin/start-thriftserver.sh
starting org.apache.spark.sql.hive.thriftserver.HiveThriftServer2, logging to /Users/dhyun/spark-release/spark-2.0.0-bin-hadoop2.7/logs/spark-dhyun-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-HW13499.local.out

$ bin/beeline -u jdbc:hive2://localhost:10000/default
Connecting to jdbc:hive2://localhost:10000/default
...
Connected to: Spark SQL (version 2.0.0)
Driver: Hive JDBC (version 1.2.1.spark2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1.spark2 by Apache Hive
0: jdbc:hive2://localhost:10000/default> select hello('a');
+----------+----------+--+
|  first   |  second  |
+----------+----------+--+
| ***a***  | ###a###  |
+----------+----------+--+
1 row selected (2.031 seconds)
0: jdbc:hive2://localhost:10000/default> describe function hello;
+--------------------------+--+
|      function_desc       |
+--------------------------+--+
| Function: default.hello  |
| Class: SampleUDTF        |
| Usage: N/A.              |
+--------------------------+--+
3 rows selected (0.041 seconds)
0: jdbc:hive2://localhost:10000/default>
{code}

I'm wondering whether Hive works with your function.
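
For anyone who wants to repeat the Thriftserver check above without beeline, the following is a minimal Java sketch over plain JDBC. It is an illustration only: the class {{UdtfCheck}} and its helper methods are hypothetical names, not part of Spark or Hive. It assumes the Hive JDBC driver is on the classpath and that the function has already been registered; the JDBC URL and the function name are passed in as arguments.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper for illustration; not part of Spark or Hive.
final class UdtfCheck {

    // Builds "SELECT <function>('<arg>')". The argument is inserted as-is,
    // so quotes and backslashes are rejected instead of escaped.
    static String selectSql(String function, String arg) {
        if (arg.indexOf('\'') >= 0 || arg.indexOf('\\') >= 0) {
            throw new IllegalArgumentException("argument must not contain quotes or backslashes");
        }
        return "SELECT " + function + "('" + arg + "')";
    }

    // Builds the lookup statement used in the beeline session.
    static String describeSql(String function) {
        return "DESCRIBE FUNCTION " + function;
    }

    // Runs one statement and prints each row with tab-separated columns.
    static void printQuery(Connection conn, String sql) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            ResultSetMetaData meta = rs.getMetaData();
            int cols = meta.getColumnCount();
            while (rs.next()) {
                StringBuilder row = new StringBuilder();
                for (int i = 1; i <= cols; i++) {
                    if (i > 1) {
                        row.append('\t');
                    }
                    row.append(rs.getString(i));
                }
                System.out.println(row);
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length != 2) {
            System.err.println("usage: UdtfCheck <jdbc-url> <function-name>");
            return;
        }
        // Assumes a JDBC driver that accepts the given URL is on the classpath.
        try (Connection conn = DriverManager.getConnection(args[0])) {
            printQuery(conn, describeSql(args[1]));
            printQuery(conn, selectSql(args[1], "a"));
        }
    }
}
```

With the session above, it could be run as {{java UdtfCheck jdbc:hive2://localhost:10000/default hello}}. If the lookup succeeds but the call fails, the second statement should raise an SQLException, which would match the behavior described in the issue.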



> Spark SQL: Thriftserver unable to run a registered Hive UDTF
> ------------------------------------------------------------
>
>                 Key: SPARK-18832
>                 URL: https://issues.apache.org/jira/browse/SPARK-18832
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0, 2.0.1, 2.0.2
>         Environment: HDP: 2.5
> Spark: 2.0.0
>            Reporter: Lokesh Yadav
>         Attachments: SampleUDTF.java
>
>
> Spark Thriftserver is unable to run a HiveUDTF.
> It throws an error saying that it cannot find the function, although the function registration succeeds and the function does show up in the list output by {{show functions}}.
> I am using a Hive UDTF, registering it using a jar placed on my local machine. I call it using the following commands:
> // Registering the function; this command succeeds.
> {{CREATE FUNCTION SampleUDTF AS 'com.fuzzylogix.experiments.udf.hiveUDF.SampleUDTF' USING JAR '/root/spark_files/experiments-1.2.jar';}}
> // Thriftserver is able to look up the function with this command:
> {{DESCRIBE FUNCTION SampleUDTF;}}
> {quote}
> {noformat}
> Output: 
> +-----------------------------------------------------------+--+
> |                       function_desc                       |
> +-----------------------------------------------------------+--+
> | Function: default.SampleUDTF                              |
> | Class: com.fuzzylogix.experiments.udf.hiveUDF.SampleUDTF  |
> | Usage: N/A.                                               |
> +-----------------------------------------------------------+--+
> {noformat}
> {quote}
> // Calling the function: 
> {{SELECT SampleUDTF('Paris');}}
> bq. Output of the above command: Error: org.apache.spark.sql.AnalysisException: Undefined function: 'SampleUDTF'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.; line 1 pos 7 (state=,code=0)
> I have also tried using a non-local jar (on HDFS), but I get the same error.
> My environment: HDP 2.5 with Spark 2.0.0
> I have attached the class file for the UDTF I am using in testing this.


