Posted to user@hive.apache.org by Michael Malak <mi...@yahoo.com> on 2013/07/31 02:00:31 UTC

UDFs with package names

Thus far, I've been able to create Hive UDFs in the "default" Java package. Now I need to define them within a named Java package, but once I do that, I'm no longer able to load them into Hive.

First off, this works:

add jar /usr/lib/hive/lib/hive-contrib-0.10.0-cdh4.3.0.jar;
create temporary function row_sequence as 'org.apache.hadoop.hive.contrib.udf.UDFRowSequence';

Then I took the source code for UDFRowSequence.java from
http://svn.apache.org/repos/asf/hive/trunk/contrib/src/java/org/apache/hadoop/hive/contrib/udf/UDFRowSequence.java

and renamed the file to UDFRowSequence2.java and the class inside to UDFRowSequence2.

I compile and deploy it with:
javac -cp /usr/lib/hive/lib/hive-exec-0.10.0-cdh4.3.0.jar:/usr/lib/hadoop/hadoop-common.jar UDFRowSequence2.java
jar cvf UDFRowSequence2.jar UDFRowSequence2.class
sudo cp UDFRowSequence2.jar /usr/local/lib


But in Hive, I get the following:
hive>  add jar /usr/local/lib/UDFRowSequence2.jar;
Added /usr/local/lib/UDFRowSequence2.jar to class path
Added resource: /usr/local/lib/UDFRowSequence2.jar
hive> create temporary function row_sequence as 'org.apache.hadoop.hive.contrib.udf.UDFRowSequence2';
FAILED: Class org.apache.hadoop.hive.contrib.udf.UDFRowSequence2 not found
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask

But if I comment out the package line in UDFRowSequence2.java (to put the UDF into the default Java package), it works:
hive>  add jar /usr/local/lib/UDFRowSequence2.jar;
Added /usr/local/lib/UDFRowSequence2.jar to class path
Added resource: /usr/local/lib/UDFRowSequence2.jar
hive> create temporary function row_sequence as 'UDFRowSequence2';
OK
Time taken: 0.383 seconds

What am I doing wrong?  I have a feeling it's something simple.


Re: UDFs with package names

Posted by Michael Malak <mi...@yahoo.com>.
Yup, it was the directory structure com/mystuff/whateverUDF.class that was missing.  Thought I had tried that before posting my question, but...

Thanks for your help!
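
For anyone landing on this thread later: a quick way to check for this problem is to compare the jar listing against the entry name the class loader will look for. The following is only a sketch; the helper names class_to_entry and listing_has_class are made up for illustration and are not part of Hive or the JDK.

```shell
# Map a fully qualified class name to the entry name it must have inside a jar,
# e.g. com.mystuff.WhateverUDF -> com/mystuff/WhateverUDF.class
# (class_to_entry is a hypothetical helper name)
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Read a jar listing on stdin (for example the output of `jar tf some.jar`)
# and succeed only if the entry for the given class is listed exactly.
# (listing_has_class is a hypothetical helper name)
listing_has_class() {
  grep -qxF "$(class_to_entry "$1")"
}
```

With the jar from the original post, something like `jar tf /usr/local/lib/UDFRowSequence2.jar | listing_has_class org.apache.hadoop.hive.contrib.udf.UDFRowSequence2` would fail, because that jar holds UDFRowSequence2.class at its root rather than under org/apache/hadoop/hive/contrib/udf/.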


________________________________
 From: Edward Capriolo <ed...@gmail.com>
To: "user@hive.apache.org" <us...@hive.apache.org>; Michael Malak <mi...@yahoo.com> 
Sent: Tuesday, July 30, 2013 7:06 PM
Subject: Re: UDFs with package names
 


It might be a better idea to use your own package, such as com.mystuff.x. You might be running into an issue where Java is not finding the class because it assumes the relation between package and jar is 1 to 1. You might also be compiling wrong. If your package is com.mystuff, the class file should be in a directory structure com/mystuff/whateverUDF.class. I am not seeing that in your example.





Re: UDFs with package names

Posted by Edward Capriolo <ed...@gmail.com>.
It might be a better idea to use your own package, such as com.mystuff.x.
You might be running into an issue where Java is not finding the class
because it assumes the relation between package and jar is 1 to 1. You
might also be compiling wrong. If your package is com.mystuff, the class
file should be in a directory structure com/mystuff/whateverUDF.class.
I am not seeing that in your example.
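
To make the directory-structure point concrete, here is a rough sketch of one way to compile and package a UDF so that the package directories end up inside the jar. It relies on javac's -d option and jar's -C option; the function name build_udf_jar and the temporary output directory are illustrative choices, not something taken from the original commands.

```shell
# Sketch: compile one source file and jar it with its package directories intact.
# Usage: build_udf_jar SOURCE.java CLASSPATH OUTPUT.jar
# (build_udf_jar is a hypothetical helper name)
build_udf_jar() {
  src=$1
  classpath=$2
  out=$3
  classes=$(mktemp -d) || return 1
  # -d makes javac write the class file under the package directories,
  # e.g. $classes/com/mystuff/WhateverUDF.class for package com.mystuff
  javac -d "$classes" -cp "$classpath" "$src" || return 1
  # -C switches to the output directory before adding files, so the
  # entry names in the jar start at the package root
  jar cf "$out" -C "$classes" . || return 1
  rm -rf "$classes"
}

# With the paths from the original post, a call might look like this:
#   build_udf_jar UDFRowSequence2.java \
#     /usr/lib/hive/lib/hive-exec-0.10.0-cdh4.3.0.jar:/usr/lib/hadoop/hadoop-common.jar \
#     UDFRowSequence2.jar
#   jar tf UDFRowSequence2.jar
```

If the package line is left in the source, the listing from `jar tf` should then include org/apache/hadoop/hive/contrib/udf/UDFRowSequence2.class, which is the path the class loader looks for when Hive is given the fully qualified name.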

