Posted to user@hive.apache.org by "Arthur.hk.chan@gmail.com" <ar...@gmail.com> on 2014/12/30 04:01:06 UTC

CREATE FUNCTION: How to automatically load extra jar file?

Hi,

I am using Hive 0.13.1 on Hadoop 2.4.1. I need Hive to load an extra JAR file automatically for a UDF; below are my steps to create the UDF. I have tried the following but still have had no luck getting it to work.

Please help!!

Regards
Arthur


Step 1:   (make sure the jar is in HDFS)
hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
-rw-r--r--   3 hadoop hadoop      57388 2014-12-30 10:02 hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar

Step 2: (drop the function if it exists)
hive> drop function sysdate;                                                  
OK
Time taken: 0.013 seconds

Step 3: (create function using the jar in HDFS)
hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
OK
Time taken: 0.034 seconds
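(Optional check: DESCRIBE FUNCTION EXTENDED should show the class name and the JAR recorded for the function in the metastore; the exact output varies by Hive version.)

hive> DESCRIBE FUNCTION EXTENDED sysdate;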

Step 4: (test)
hive> select sysdate();                                                                                                                                
Automatically selecting local only mode for query
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/hadoop/hbase-0.98.5-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: yarn.nodemanager.loacl-dirs;  Ignoring.
14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
Execution log at: /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
java.io.FileNotFoundException: File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
	at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
	at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar)'
Execution failed with exit status: 1
Obtaining error information
Task failed!
Task ID:
  Stage-1
Logs:
/tmp/hadoop/hive.log
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask


Step 5: (check the file)
hive> dfs -ls /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar;
ls: `/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar': No such file or directory
Command failed with exit code = 1
Query returned non-zero code: 1, cause: null
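(Side note: the failure happens while the automatically selected local-only mode submits the job, so one thing worth testing is to disable auto local mode and re-run the query. hive.exec.mode.local.auto is a standard Hive property, though whether it sidesteps this particular problem is only a guess.)

hive> set hive.exec.mode.local.auto=false;
hive> select sysdate();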






Re: CREATE FUNCTION: How to automatically load extra jar file?

Posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com>.
Hi

I have already placed it in another folder, not the /tmp/ one:

>>> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>> -rw-r--r--   3 hadoop hadoop      57388 2014-12-30 10:02 hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar

However, Hive copies it into the /tmp/ folder during "CREATE FUNCTION ... USING JAR":
>>> Step 3: (create function using the jar in HDFS)
>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>> OK
>>> Time taken: 0.034 seconds


Any ideas on how to stop Hive from using the /tmp/ folder?
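(For what it's worth: later Hive releases expose a hive.downloaded.resources.dir property that controls where these per-session resources are localized. Whether a 0.13.1 build honors it is an assumption, and the path below is only an example.)

hive> set hive.downloaded.resources.dir=/var/hive/resources;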

Arthur



On 31 Dec, 2014, at 2:27 pm, Nitin Pawar <ni...@gmail.com> wrote:

> If you put a file inside tmp then there is no guarantee it will live there forever, depending on your cluster configuration. 
> 
> You may want to put it in a place where all users can access it, for example by creating a folder and giving it read permission. 
> 
> On Wed, Dec 31, 2014 at 11:40 AM, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:
> 
> Hi,
> 
> Thanks.
> 
> Below are my steps: I copied my JAR to HDFS and ran "CREATE FUNCTION ... USING JAR" with the JAR in HDFS; however, during my smoke test I got a FileNotFoundException.
> 
>>> java.io.FileNotFoundException: File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
> 
> 
> 
>>> Step 1:   (make sure the jar is in HDFS)
>>> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>> -rw-r--r--   3 hadoop hadoop      57388 2014-12-30 10:02 hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>> 
>>> Step 2: (drop the function if it exists)
>>> hive> drop function sysdate;                                                  
>>> OK
>>> Time taken: 0.013 seconds
>>> 
>>> Step 3: (create function using the jar in HDFS)
>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>> OK
>>> Time taken: 0.034 seconds
>>> 
>>> Step 4: (test)
>>> hive> select sysdate(); 
>>> Execution log at: /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
>>> java.io.FileNotFoundException: File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
> 
> 
> Please help!
> 
> Arthur
> 
> 
> 
> On 31 Dec, 2014, at 12:31 am, Nitin Pawar <ni...@gmail.com> wrote:
> 
>> just copy-pasting Jason's reply from the other thread 
>> 
>> If you have a recent version of Hive (0.13+), you could try registering your UDF as a "permanent" UDF which was added in HIVE-6047:
>> 
>> 1) Copy your JAR somewhere on HDFS, say hdfs:///home/nirmal/udf/hiveUDF-1.0-SNAPSHOT.jar. 
>> 2) In Hive, run CREATE FUNCTION zeroifnull AS 'com.test.udf.ZeroIfNullUDF' USING JAR 'hdfs:///home/nirmal/udf/hiveUDF-1.0-SNAPSHOT.jar';
>> 
>> The function definition should be saved in the metastore and Hive should remember to pull the JAR from the location you specified in the CREATE FUNCTION call.
>> 
>> On Tue, Dec 30, 2014 at 9:54 PM, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:
>> Thank you.
>> 
>> Will this work for hiveserver2?
>> 
>> 
>> Arthur
>> 
>> On 30 Dec, 2014, at 2:24 pm, vic0777 <vi...@163.com> wrote:
>> 
>>> 
>>> You can put it into $HOME/.hiverc like this: ADD JAR full_path_of_the_jar. Then, the file is automatically loaded when Hive is started.
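>>> For example, a minimal $HOME/.hiverc could look like this (the local path is only an example); note that this registers the function once per CLI session:
>>> 
>>> ADD JAR /usr/local/hive/aux/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>> CREATE TEMPORARY FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate';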
>>> 
>>> Wantao
>>> 
>>> 
>>> 
>>> 
>>> At 2014-12-30 11:01:06, "Arthur.hk.chan@gmail.com" <ar...@gmail.com> wrote:
>>> Hi,
>>> 
>>> I am using Hive 0.13.1 on Hadoop 2.4.1. I need Hive to load an extra JAR file automatically for a UDF; below are my steps to create the UDF. I have tried the following but still have had no luck getting it to work.
>>> 
>>> Please help!!
>>> 
>>> Regards
>>> Arthur
>>> 
>>> 
>>> Step 1:   (make sure the jar is in HDFS)
>>> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>> -rw-r--r--   3 hadoop hadoop      57388 2014-12-30 10:02 hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>> 
>>> Step 2: (drop the function if it exists)
>>> hive> drop function sysdate;                                                  
>>> OK
>>> Time taken: 0.013 seconds
>>> 
>>> Step 3: (create function using the jar in HDFS)
>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>> OK
>>> Time taken: 0.034 seconds
>>> 
>>> Step 4: (test)
>>> hive> select sysdate();                                                                                                                                
>>> Automatically selecting local only mode for query
>>> Total jobs = 1
>>> Launching Job 1 out of 1
>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>> SLF4J: Class path contains multiple SLF4J bindings.
>>> SLF4J: Found binding in [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>> SLF4J: Found binding in [jar:file:/hadoop/hbase-0.98.5-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: yarn.nodemanager.loacl-dirs;  Ignoring.
>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
>>> Execution log at: /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
>>> java.io.FileNotFoundException: File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>> 	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
>>> 	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
>>> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>> 	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
>>> 	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
>>> 	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
>>> 	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
>>> 	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
>>> 	at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
>>> 	at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
>>> 	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
>>> 	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
>>> 	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
>>> 	at java.security.AccessController.doPrivileged(Native Method)
>>> 	at javax.security.auth.Subject.doAs(Subject.java:415)
>>> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>> 	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
>>> 	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>>> 	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>>> 	at java.security.AccessController.doPrivileged(Native Method)
>>> 	at javax.security.auth.Subject.doAs(Subject.java:415)
>>> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>> 	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>>> 	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>>> 	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
>>> 	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
>>> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> 	at java.lang.reflect.Method.invoke(Method.java:606)
>>> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>>> Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar)'
>>> Execution failed with exit status: 1
>>> Obtaining error information
>>> Task failed!
>>> Task ID:
>>>   Stage-1
>>> Logs:
>>> /tmp/hadoop/hive.log
>>> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>> 
>>> 
>>> Step 5: (check the file)
>>> hive> dfs -ls /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>> ls: `/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar': No such file or directory
>>> Command failed with exit code = 1
>>> Query returned non-zero code: 1, cause: null
>>> 
>> 
>> 
>> 
>> 
>> -- 
>> Nitin Pawar
> 
> 
> 
> 
> -- 
> Nitin Pawar


Re: CREATE FUNCTION: How to automatically load extra jar file?

Posted by Nitin Pawar <ni...@gmail.com>.
If you put a file inside tmp then there is no guarantee it will live there
forever, depending on your cluster configuration.

You may want to put it in a place where all users can access it, for example by
creating a folder and giving it read permission.
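
For example, something along these lines (the HDFS paths are only illustrative):

$ hdfs dfs -mkdir -p /apps/hive/udf
$ hdfs dfs -put nexr-hive-udf-0.2-SNAPSHOT.jar /apps/hive/udf/
$ hdfs dfs -chmod -R a+rx /apps/hive/udf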

On Wed, Dec 31, 2014 at 11:40 AM, Arthur.hk.chan@gmail.com <
arthur.hk.chan@gmail.com> wrote:

>
> Hi,
>
> Thanks.
>
> Below are my steps: I copied my JAR to HDFS and ran "CREATE FUNCTION ... USING
> JAR" with the JAR in HDFS; however, during my smoke test I got a FileNotFoundException.
>
> java.io.FileNotFoundException: File does not exist:
>> hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>
>>
>
>
> Step 1:   (make sure the jar is in HDFS)
>> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
>> -rw-r--r--   3 hadoop hadoop      57388 2014-12-30 10:02
>> hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>
>> Step 2: (drop the function if it exists)
>> hive> drop function sysdate;
>>
>> OK
>> Time taken: 0.013 seconds
>>
>> Step 3: (create function using the jar in HDFS)
>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate'
>> using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>> Added
>> /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>> to class path
>> Added resource:
>> /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>> OK
>> Time taken: 0.034 seconds
>>
>> Step 4: (test)
>> hive> select sysdate();
>>
>> Execution log at:
>> /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
>> java.io.FileNotFoundException: File does not exist:
>> hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>
>>
>
> Please help!
>
> Arthur
>
>
>
> On 31 Dec, 2014, at 12:31 am, Nitin Pawar <ni...@gmail.com> wrote:
>
> just copy-pasting Jason's reply from the other thread
>
> If you have a recent version of Hive (0.13+), you could try registering
> your UDF as a "permanent" UDF which was added in HIVE-6047:
>
> 1) Copy your JAR somewhere on HDFS, say
> hdfs:///home/nirmal/udf/hiveUDF-1.0-SNAPSHOT.jar.
> 2) In Hive, run CREATE FUNCTION zeroifnull AS
> 'com.test.udf.ZeroIfNullUDF' USING JAR '
> hdfs:///home/nirmal/udf/hiveUDF-1.0-SNAPSHOT.jar';
>
> The function definition should be saved in the metastore and Hive should
> remember to pull the JAR from the location you specified in the CREATE
> FUNCTION call.
>
> On Tue, Dec 30, 2014 at 9:54 PM, Arthur.hk.chan@gmail.com <
> arthur.hk.chan@gmail.com> wrote:
>
>> Thank you.
>>
>> Will this work for hiveserver2?
>>
>>
>> Arthur
>>
>> On 30 Dec, 2014, at 2:24 pm, vic0777 <vi...@163.com> wrote:
>>
>>
>> You can put it into $HOME/.hiverc like this: ADD JAR
>> full_path_of_the_jar. Then, the file is automatically loaded when Hive is
>> started.
>>
>> Wantao
>>
>>
>>
>>
>> At 2014-12-30 11:01:06, "Arthur.hk.chan@gmail.com" <
>> arthur.hk.chan@gmail.com> wrote:
>>
>> Hi,
>>
>> I am using Hive 0.13.1 on Hadoop 2.4.1. I need Hive to load an extra JAR
>> file automatically for a UDF; below are my steps to create the UDF. I have
>> tried the following but still have had no luck getting it to work.
>>
>> Please help!!
>>
>> Regards
>> Arthur
>>
>>
>> Step 1:   (make sure the jar is in HDFS)
>> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
>> -rw-r--r--   3 hadoop hadoop      57388 2014-12-30 10:02
>> hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>
>> Step 2: (drop the function if it exists)
>> hive> drop function sysdate;
>>
>> OK
>> Time taken: 0.013 seconds
>>
>> Step 3: (create function using the jar in HDFS)
>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate'
>> using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>> Added
>> /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>> to class path
>> Added resource:
>> /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>> OK
>> Time taken: 0.034 seconds
>>
>> Step 4: (test)
>> hive> select sysdate();
>>
>>
>> Automatically selecting local only mode for query
>> Total jobs = 1
>> Launching Job 1 out of 1
>> Number of reduce tasks is set to 0 since there's no reduce operator
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in
>> [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in
>> [jar:file:/hadoop/hbase-0.98.5-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>> explanation.
>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>> 14/12/30 10:17:06 WARN conf.Configuration:
>> file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an
>> attempt to override final parameter:
>> mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>> 14/12/30 10:17:06 WARN conf.Configuration:
>> file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an
>> attempt to override final parameter: yarn.nodemanager.loacl-dirs;  Ignoring.
>> 14/12/30 10:17:06 WARN conf.Configuration:
>> file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an
>> attempt to override final parameter:
>> mapreduce.job.end-notification.max.attempts;  Ignoring.
>> Execution log at:
>> /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
>> java.io.FileNotFoundException: File does not exist:
>> hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>> at
>> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
>> at
>> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
>> at
>> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>> at
>> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
>> at
>> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
>> at
>> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
>> at
>> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
>> at
>> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
>> at
>> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
>> at
>> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
>> at
>> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:415)
>> at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:415)
>> at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>> at
>> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>> at
>> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
>> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>> Job Submission failed with exception 'java.io.FileNotFoundException(File
>> does not exist:
>> hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>> )'
>> Execution failed with exit status: 1
>> Obtaining error information
>> Task failed!
>> Task ID:
>>   Stage-1
>> Logs:
>> /tmp/hadoop/hive.log
>> FAILED: Execution Error, return code 1 from
>> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>
>>
>> Step 5: (check the file)
>> hive> dfs -ls
>> /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar;
>> ls: `/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar':
>> No such file or directory
>> Command failed with exit code = 1
>> Query returned non-zero code: 1, cause: null
>>
>
>
> --
> Nitin Pawar
>
>
>


-- 
Nitin Pawar

Re: CREATE FUNCTION: How to automatically load extra jar file?

Posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com>.
Hi,

Thanks.

Below are my steps: I copied my JAR to HDFS and ran "CREATE FUNCTION ... USING JAR" with the JAR in HDFS; however, during my smoke test I got a FileNotFoundException.

>> java.io.FileNotFoundException: File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar



>> Step 1:   (make sure the jar is in HDFS)
>> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
>> -rw-r--r--   3 hadoop hadoop      57388 2014-12-30 10:02 hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>> 
>> Step 2: (drop the function if it exists)
>> hive> drop function sysdate;                                                  
>> OK
>> Time taken: 0.013 seconds
>> 
>> Step 3: (create function using the jar in HDFS)
>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>> OK
>> Time taken: 0.034 seconds
>> 
>> Step 4: (test)
>> hive> select sysdate(); 
>> Execution log at: /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
>> java.io.FileNotFoundException: File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar


Please help!

Arthur



On 31 Dec, 2014, at 12:31 am, Nitin Pawar <ni...@gmail.com> wrote:

> just copy-pasting Jason's reply from the other thread 
> 
> If you have a recent version of Hive (0.13+), you could try registering your UDF as a "permanent" UDF which was added in HIVE-6047:
> 
> 1) Copy your JAR somewhere on HDFS, say hdfs:///home/nirmal/udf/hiveUDF-1.0-SNAPSHOT.jar. 
> 2) In Hive, run CREATE FUNCTION zeroifnull AS 'com.test.udf.ZeroIfNullUDF' USING JAR 'hdfs:///home/nirmal/udf/hiveUDF-1.0-SNAPSHOT.jar';
> 
> The function definition should be saved in the metastore and Hive should remember to pull the JAR from the location you specified in the CREATE FUNCTION call.
> 
> On Tue, Dec 30, 2014 at 9:54 PM, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:
> Thank you.
> 
> Will this work for hiveserver2?
> 
> 
> Arthur
> 
> On 30 Dec, 2014, at 2:24 pm, vic0777 <vi...@163.com> wrote:
> 
>> 
>> You can put it into $HOME/.hiverc like this: ADD JAR full_path_of_the_jar. Then, the file is automatically loaded when Hive is started.
>> 
>> Wantao
>> 
>> 
>> 
>> 
>> At 2014-12-30 11:01:06, "Arthur.hk.chan@gmail.com" <ar...@gmail.com> wrote:
>> Hi,
>> 
>> I am using Hive 0.13.1 on Hadoop 2.4.1. I need Hive to load an extra JAR file automatically for a UDF; below are my steps to create the UDF. I have tried the following but still have had no luck getting it to work.
>> 
>> Please help!!
>> 
>> Regards
>> Arthur
>> 
>> 
>> Step 1:   (make sure the jar is in HDFS)
>> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
>> -rw-r--r--   3 hadoop hadoop      57388 2014-12-30 10:02 hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>> 
>> Step 2: (drop the function if it exists)
>> hive> drop function sysdate;                                                  
>> OK
>> Time taken: 0.013 seconds
>> 
>> Step 3: (create function using the jar in HDFS)
>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>> OK
>> Time taken: 0.034 seconds
>> 
>> Step 4: (test)
>> hive> select sysdate();                                                                                                                                
>> Automatically selecting local only mode for query
>> Total jobs = 1
>> Launching Job 1 out of 1
>> Number of reduce tasks is set to 0 since there's no reduce operator
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in [jar:file:/hadoop/hbase-0.98.5-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: yarn.nodemanager.loacl-dirs;  Ignoring.
>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
>> Execution log at: /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
>> java.io.FileNotFoundException: File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>> 	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
>> 	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
>> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>> 	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
>> 	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
>> 	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
>> 	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
>> 	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
>> 	at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
>> 	at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
>> 	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
>> 	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
>> 	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
>> 	at java.security.AccessController.doPrivileged(Native Method)
>> 	at javax.security.auth.Subject.doAs(Subject.java:415)
>> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>> 	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
>> 	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>> 	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>> 	at java.security.AccessController.doPrivileged(Native Method)
>> 	at javax.security.auth.Subject.doAs(Subject.java:415)
>> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>> 	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>> 	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>> 	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
>> 	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
>> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> 	at java.lang.reflect.Method.invoke(Method.java:606)
>> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>> Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar)'
>> Execution failed with exit status: 1
>> Obtaining error information
>> Task failed!
>> Task ID:
>>   Stage-1
>> Logs:
>> /tmp/hadoop/hive.log
>> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>> 
>> 
>> Step 5: (check the file)
>> hive> dfs -ls /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar;
>> ls: `/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar': No such file or directory
>> Command failed with exit code = 1
>> Query returned non-zero code: 1, cause: null
>> 
> 
> 
> 
> 
> -- 
> Nitin Pawar


Re: CREATE FUNCTION: How to automatically load extra jar file?

Posted by Nitin Pawar <ni...@gmail.com>.
just copy-pasting Jason's reply from the other thread

If you have a recent version of Hive (0.13+), you could try registering
your UDF as a "permanent" UDF which was added in HIVE-6047:

1) Copy your JAR somewhere on HDFS, say
hdfs:///home/nirmal/udf/hiveUDF-1.0-SNAPSHOT.jar.
2) In Hive, run CREATE FUNCTION zeroifnull AS 'com.test.udf.ZeroIfNullUDF'
USING JAR 'hdfs:///home/nirmal/udf/hiveUDF-1.0-SNAPSHOT.jar';

The function definition should be saved in the metastore and Hive should
remember to pull the JAR from the location you specified in the CREATE
FUNCTION call.
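
One detail worth noting: these permanent functions are scoped to a database, so in
a new session the fully qualified name should also resolve (assuming the function
was created while using the default database):

hive> SELECT default.zeroifnull(1);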

On Tue, Dec 30, 2014 at 9:54 PM, Arthur.hk.chan@gmail.com <
arthur.hk.chan@gmail.com> wrote:

> Thank you.
>
> Will this work for hiveserver2?
>
>
> Arthur
>
> On 30 Dec, 2014, at 2:24 pm, vic0777 <vi...@163.com> wrote:
>
>
> You can put it into $HOME/.hiverc like this: ADD JAR full_path_of_the_jar.
> Then, the file is automatically loaded when Hive is started.
>
> Wantao
>
>
>
>
> At 2014-12-30 11:01:06, "Arthur.hk.chan@gmail.com" <
> arthur.hk.chan@gmail.com> wrote:
>
> Hi,
>
> I am using Hive 0.13.1 on Hadoop 2.4.1. I need Hive to load an extra JAR
> file automatically for a UDF; below are my steps to create the UDF. I have
> tried the following but still have had no luck getting it to work.
>
> Please help!!
>
> Regards
> Arthur
>
>
> Step 1:   (make sure the jar is in HDFS)
> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
> -rw-r--r--   3 hadoop hadoop      57388 2014-12-30 10:02
> hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>
> Step 2: (drop the function if it exists)
> hive> drop function sysdate;
>
> OK
> Time taken: 0.013 seconds
>
> Step 3: (create function using the jar in HDFS)
> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate'
> using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
> Added
> /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
> to class path
> Added resource:
> /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
> OK
> Time taken: 0.034 seconds
>
> Step 4: (test)
> hive> select sysdate();
>
>
> Automatically selecting local only mode for query
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/hadoop/hbase-0.98.5-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 14/12/30 10:17:06 WARN conf.Configuration:
> file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an
> attempt to override final parameter:
> mapreduce.job.end-notification.max.retry.interval;  Ignoring.
> 14/12/30 10:17:06 WARN conf.Configuration:
> file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an
> attempt to override final parameter: yarn.nodemanager.loacl-dirs;  Ignoring.
> 14/12/30 10:17:06 WARN conf.Configuration:
> file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an
> attempt to override final parameter:
> mapreduce.job.end-notification.max.attempts;  Ignoring.
> Execution log at:
> /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
> java.io.FileNotFoundException: File does not exist:
> hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
> at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
> at
> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
> at
> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
> at
> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
> at
> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
> at
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
> at
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
> at
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
> at
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Job Submission failed with exception 'java.io.FileNotFoundException(File
> does not exist:
> hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
> )'
> Execution failed with exit status: 1
> Obtaining error information
> Task failed!
> Task ID:
>   Stage-1
> Logs:
> /tmp/hadoop/hive.log
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>
>
> Step 5: (check the file)
> hive> dfs -ls
> /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar;
> ls: `/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar':
> No such file or directory
> Command failed with exit code = 1
> Query returned non-zero code: 1, cause: null
>


-- 
Nitin Pawar

Re: CREATE FUNCTION: How to automatically load extra jar file?

Posted by Jason Dere <jd...@hortonworks.com>.
Can you search your hive.log file after step (4) for "_resources", or "adding libjars:"?
Still a bit surprised you don't see /tmp/abce45b1-6041-40b6-83ed-8c6491216360_resources/nexr.jar on the local file system. This is the Hive CLI, right? And were steps 1-4 all run in the same Hive CLI session? Are you looking for /tmp/abce45b1-6041-40b6-83ed-8c6491216360_resources/nexr.jar after closing the Hive CLI, or while it's still running?
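
For example, something like this against the log location shown earlier in the thread (exact log messages can differ between versions):

$ grep -nE '_resources|adding libjars:' /tmp/hadoop/hive.log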


On Jan 14, 2015, at 11:34 PM, "Arthur.hk.chan@gmail.com" <ar...@gmail.com> wrote:

> Hi,
>  
> I have deleted the original Hive metastore database from MySQL and re-created a new one with character set 'latin1';
> I also put the JAR file into HDFS with a shorter file name, and the 'max key length is 767 bytes' issue from MySQL is resolved.
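> (Roughly like this; the metastore database name here is just an example, it is whatever the hive-site.xml JDBC URL points at:)
> mysql> CREATE DATABASE metastore DEFAULT CHARACTER SET latin1 DEFAULT COLLATE latin1_swedish_ci;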
> 
> 
> Tried again:
> 1) drop function sysdate;
> 2) CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://mycluster/hadoop/nexr.jar';
> 
> 3) (check the hive log)
> 2015-01-15 15:05:43,133 INFO  [main]: ql.Driver (Driver.java:getSchema(238)) - Returning Hive schema: Schema(fieldSchemas:null, properties:null)
> 2015-01-15 15:05:43,133 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=compile start=1421305543126 end=1421305543133 duration=7 from=org.apache.hadoop.hive.ql.Driver>
> 2015-01-15 15:05:43,133 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
> 2015-01-15 15:05:43,133 INFO  [main]: ql.Driver (Driver.java:execute(1192)) - Starting command: CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://mycluster/hadoop/nexr.jar'
> 2015-01-15 15:05:43,134 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=TimeToSubmit start=1421305543125 end=1421305543134 duration=9 from=org.apache.hadoop.hive.ql.Driver>
> 2015-01-15 15:05:43,134 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver>
> 2015-01-15 15:05:43,134 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=task.FUNCTION.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
> 2015-01-15 15:05:43,135 INFO  [main]: SessionState (SessionState.java:printInfo(536)) - converting to local  hdfs://mycluster/hadoop/nexr.jar
> 2015-01-15 15:05:43,142 INFO  [main]: SessionState (SessionState.java:printInfo(536)) - Added /tmp/606e6a26-775f-40c8-be18-a670deee2f7e_resources/nexr.jar to class path
> 2015-01-15 15:05:43,142 INFO  [main]: SessionState (SessionState.java:printInfo(536)) - Added resource: /tmp/606e6a26-775f-40c8-be18-a670deee2f7e_resources/nexr.jar
> 
> 4) select sysdate();                                                                                                    
> converting to local hdfs://mycluster/hadoop/nexr.jar
> Added /tmp/abce45b1-6041-40b6-83ed-8c6491216360_resources/nexr.jar to class path
> Added resource: /tmp/abce45b1-6041-40b6-83ed-8c6491216360_resources/nexr.jar
> Automatically selecting local only mode for query
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> java.io.FileNotFoundException: File does not exist: hdfs://mycluster/tmp/abce45b1-6041-40b6-83ed-8c6491216360_resources/nexr.jar
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
> 	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
> 	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
> 	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
> 	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
> 	at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
> 	at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
> 	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
> 	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
> 	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> 	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
> 	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
> 	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> 	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
> 	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
> 	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
> 	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: hdfs://mycluster/tmp/abce45b1-6041-40b6-83ed-8c6491216360_resources/nexr.jar)'
> Execution failed with exit status: 1
> Obtaining error information
> Task failed!
> 
> 5) I cannot find "abce45b1-6041-40b6-83ed-8c6491216360_resources/nexr.jar" at "hdfs://mycluster/tmp/abce45b1-6041-40b6-83ed-8c6491216360_resources/nexr.jar",
> and it is also not found in the local /tmp/ folder.
> 
> 
> I think the issue here is that the libjars setting for the map/reduce job is somehow getting sent without the "file:///" scheme, which might be causing Hadoop to interpret the path as an HDFS path rather than a local path.
> 
> Is there a way to verify the libjars setting for my map/reduce job?
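> (One hedged way to check: the generated job configuration for the local job, whose jobconf.xml path appears in the warnings earlier in the thread, carries the added jars in its tmpjars property, URI scheme included. Depending on how the XML is laid out, something like this may pull it out; the path pattern is based on those warnings:)
> $ grep -o 'tmpjars</name><value>[^<]*' /tmp/hadoop/hive_*/-local-10003/jobconf.xml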
> 
> Please help!
> Regards
> Arthur
> 
> 
> 
> 
> On 11 Jan, 2015, at 5:35 pm, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:
> 
>> Hi,
>> 
>> 
>> 
>> mysql> show variables like "character_set_database";
>> +------------------------+--------+
>> | Variable_name          | Value  |
>> +------------------------+--------+
>> | character_set_database | latin1 |
>> +------------------------+--------+
>> 1 row in set (0.00 sec)
>> 
>> mysql> show variables like "collation_database";
>> +--------------------+-------------------+
>> | Variable_name      | Value             |
>> +--------------------+-------------------+
>> | collation_database | latin1_swedish_ci |
>> +--------------------+-------------------+
>> 1 row in set (0.00 sec)
>> 
>> 
>> 
>> 2015-01-11 17:21:07,835 ERROR [main]: DataNucleus.Datastore (Log4JLogger.java:error(115)) - An exception was thrown while adding/validating class(es) : Specified key was too long; max key length is 767 bytes
>> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes
>> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>> 	at com.mysql.jdbc.Util.handleNewInstance(Util.java:408)
>> 	at com.mysql.jdbc.Util.getInstance(Util.java:383)
>> 	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1062)
>> 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4226)
>> 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4158)
>> 	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2615)
>> 	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2776)
>> 	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2834)
>> 	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2783)
>> 	at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:908)
>> 	at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:788)
>> 	at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:254)
>> 	at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:760)
>> 	at org.datanucleus.store.rdbms.table.TableImpl.createIndices(TableImpl.java:648)
>> 	at org.datanucleus.store.rdbms.table.TableImpl.validateIndices(TableImpl.java:593)
>> 	at org.datanucleus.store.rdbms.table.TableImpl.validateConstraints(TableImpl.java:390)
>> 	at org.datanucleus.store.rdbms.table.ClassTable.validateConstraints(ClassTable.java:3463)
>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3464)
>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.addClassTablesAndValidate(RDBMSStoreManager.java:3190)
>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2841)
>> 	at org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:122)
>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager.addClasses(RDBMSStoreManager.java:1605)
>> 	at org.datanucleus.store.AbstractStoreManager.addClass(AbstractStoreManager.java:954)
>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:679)
>> 	at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getStatementForCandidates(RDBMSQueryUtils.java:408)
>> 	at org.datanucleus.store.rdbms.query.JDOQLQuery.compileQueryFull(JDOQLQuery.java:947)
>> 	at org.datanucleus.store.rdbms.query.JDOQLQuery.compileInternal(JDOQLQuery.java:370)
>> 	at org.datanucleus.store.query.Query.executeQuery(Query.java:1744)
>> 	at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672)
>> 	at org.datanucleus.store.query.Query.execute(Query.java:1654)
>> 	at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:221)
>> 	at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:121)
>> 	at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:252)
>> 	at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:223)
>> 	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
>> 	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>> 	at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:58)
>> 	at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:67)
>> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:497)
>> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:475)
>> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:523)
>> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:397)
>> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:356)
>> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:54)
>> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59)
>> 	at org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4944)
>> 	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:171)
>> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>> 	at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410)
>> 	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
>> 	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
>> 	at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
>> 	at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
>> 	at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340)
>> 	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
>> 	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
>> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> 	at java.lang.reflect.Method.invoke(Method.java:606)
>> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>> 
>> 2015-01-11 17:21:07,835 ERROR [main]: DataNucleus.Datastore (Log4JLogger.java:error(115)) - An exception was thrown while adding/validating class(es) : Specified key was too long; max key length is 767 bytes
>> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes
>> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>> 	at com.mysql.jdbc.Util.handleNewInstance(Util.java:408)
>> 	at com.mysql.jdbc.Util.getInstance(Util.java:383)
>> 	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1062)
>> 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4226)
>> 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4158)
>> 	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2615)
>> 	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2776)
>> 	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2834)
>> 	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2783)
>> 	at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:908)
>> 	at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:788)
>> 	at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:254)
>> 	at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:760)
>> 	at org.datanucleus.store.rdbms.table.TableImpl.createIndices(TableImpl.java:648)
>> 	at org.datanucleus.store.rdbms.table.TableImpl.validateIndices(TableImpl.java:593)
>> 	at org.datanucleus.store.rdbms.table.TableImpl.validateConstraints(TableImpl.java:390)
>> 	at org.datanucleus.store.rdbms.table.ClassTable.validateConstraints(ClassTable.java:3463)
>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3464)
>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.addClassTablesAndValidate(RDBMSStoreManager.java:3190)
>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2841)
>> 	at org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:122)
>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager.addClasses(RDBMSStoreManager.java:1605)
>> 	at org.datanucleus.store.AbstractStoreManager.addClass(AbstractStoreManager.java:954)
>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:679)
>> 	at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getStatementForCandidates(RDBMSQueryUtils.java:408)
>> 	at org.datanucleus.store.rdbms.query.JDOQLQuery.compileQueryFull(JDOQLQuery.java:947)
>> 	at org.datanucleus.store.rdbms.query.JDOQLQuery.compileInternal(JDOQLQuery.java:370)
>> 	at org.datanucleus.store.query.Query.executeQuery(Query.java:1744)
>> 	at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672)
>> 	at org.datanucleus.store.query.Query.execute(Query.java:1654)
>> 	at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:221)
>> 	at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:121)
>> 	at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:252)
>> 	at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:223)
>> 	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
>> 	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>> 	at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:58)
>> 	at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:67)
>> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:497)
>> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:475)
>> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:523)
>> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:397)
>> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:356)
>> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:54)
>> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59)
>> 	at org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4944)
>> 	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:171)
>> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>> 	at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410)
>> 	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
>> 	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
>> 	at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
>> 	at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
>> 	at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340)
>> 	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
>> 	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
>> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> 	at java.lang.reflect.Method.invoke(Method.java:606)
>> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>> 
>> 
>> What could be wrong?
>> Regards
>> Arthur
>> 
>> 
>> On 11 Jan, 2015, at 5:18 pm, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:
>> 
>>> Hi,
>>> 
>>> 
>>> 2015-01-04 08:57:12,154 ERROR [main]: DataNucleus.Datastore (Log4JLogger.java:error(115)) - An exception was thrown while adding/validating class(es) : Specified key was too long; max key length is 767 bytes
>>> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes
>>> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>> 	at com.mysql.jdbc.Util.handleNewInstance(Util.java:408)
>>> 	at com.mysql.jdbc.Util.getInstance(Util.java:383)
>>> 	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1062)
>>> 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4226)
>>> 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4158)
>>> 	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2615)
>>> 	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2776)
>>> 	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2834)
>>> 	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2783)
>>> 	at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:908)
>>> 	at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:788)
>>> 	at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:254)
>>> 	at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:760)
>>> 	at org.datanucleus.store.rdbms.table.TableImpl.createIndices(TableImpl.java:648)
>>> 	at org.datanucleus.store.rdbms.table.TableImpl.validateIndices(TableImpl.java:593)
>>> 	at org.datanucleus.store.rdbms.table.TableImpl.validateConstraints(TableImpl.java:390)
>>> 	at org.datanucleus.store.rdbms.table.ClassTable.validateConstraints(ClassTable.java:3463)
>>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3464)
>>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.addClassTablesAndValidate(RDBMSStoreManager.java:3190)
>>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2841)
>>> 	at org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:122)
>>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager.addClasses(RDBMSStoreManager.java:1605)
>>> 	at org.datanucleus.store.AbstractStoreManager.addClass(AbstractStoreManager.java:954)
>>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:679)
>>> 	at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getStatementForCandidates(RDBMSQueryUtils.java:408)
>>> 	at org.datanucleus.store.rdbms.query.JDOQLQuery.compileQueryFull(JDOQLQuery.java:947)
>>> 	at org.datanucleus.store.rdbms.query.JDOQLQuery.compileInternal(JDOQLQuery.java:370)
>>> 	at org.datanucleus.store.query.Query.executeQuery(Query.java:1744)
>>> 	at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672)
>>> 	at org.datanucleus.store.query.Query.execute(Query.java:1654)
>>> 	at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:221)
>>> 	at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:121)
>>> 	at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:252)
>>> 	at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:223)
>>> 	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
>>> 	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>>> 	at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:58)
>>> 	at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:67)
>>> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:497)
>>> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:475)
>>> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_all_databases(HiveMetaStore.java:1026)
>>> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> 	at java.lang.reflect.Method.invoke(Method.java:606)
>>> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
>>> 	at com.sun.proxy.$Proxy10.get_all_databases(Unknown Source)
>>> 	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllDatabases(HiveMetaStoreClient.java:837)
>>> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> 	at java.lang.reflect.Method.invoke(Method.java:606)
>>> 	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
>>> 	at com.sun.proxy.$Proxy11.getAllDatabases(Unknown Source)
>>> 	at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1098)
>>> 	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionNames(FunctionRegistry.java:671)
>>> 	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionNames(FunctionRegistry.java:662)
>>> 	at org.apache.hadoop.hive.cli.CliDriver.getCommandCompletor(CliDriver.java:540)
>>> 	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:758)
>>> 	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
>>> 	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
>>> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> 	at java.lang.reflect.Method.invoke(Method.java:606)
>>> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>>> 
>>> Regards
>>> Arthur
>>> 
>>> 
>>> 
>>> On 7 Jan, 2015, at 7:22 am, Jason Dere <jd...@hortonworks.com> wrote:
>>> 
>>>> Does your hive.log contain any lines with "adding libjars:"?
>>>> 
>>>> You may also want to search for any lines containing "_resources"; I would like to see the results of both searches.
>>>> 
>>>> For example, mine is showing the following line:
>>>> 2015-01-06 14:53:28,115 INFO  mr.ExecDriver (ExecDriver.java:execute(307)) - adding libjars: file:///tmp/d0ed1585-d9e6-4944-b985-225351574de0_resources/spatial-sdk-hive-1.0.3-SNAPSHOT.jar,file:///tmp/d0ed1585-d9e6-4944-b985-225351574de0_resources/esri-geometry-api.jar
>>>> 
>>>> I wonder if your libjars setting for the map/reduce job is somehow getting sent without the "file:///", which might be causing hadoop to interpret the path as a HDFS path rather than a local path.
>>>> 
>>>> On Jan 6, 2015, at 1:11 AM, Arthur.hk.chan <ar...@gmail.com> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> my hadoop’s core-site.xml contains the following about tmp
>>>>> 
>>>>> <property>
>>>>>   <name>hadoop.tmp.dir</name>
>>>>>   <value>/hadoop_data/hadoop_data/tmp</value>
>>>>> </property>
>>>>> 
>>>>> 
>>>>> 
>>>>> my hive-default.xml contains the following about tmp
>>>>> 
>>>>> <property>
>>>>>   <name>hive.exec.scratchdir</name>
>>>>>   <value>/tmp/hive-${user.name}</value>
>>>>>   <description>Scratch space for Hive jobs</description>
>>>>> </property>
>>>>> 
>>>>> <property>
>>>>>   <name>hive.exec.local.scratchdir</name>
>>>>>   <value>/tmp/${user.name}</value>
>>>>>   <description>Local scratch space for Hive jobs</description>
>>>>> </property>
>>>>> 
>>>>> 
>>>>> 
>>>>> Could this be related to a configuration issue, or is it a bug?
>>>>> 
>>>>> Please help!
>>>>> 
>>>>> Regards
>>>>> Arthur
>>>>> 
>>>>> 
>>>>> On 6 Jan, 2015, at 3:45 am, Jason Dere <jd...@hortonworks.com> wrote:
>>>>> 
>>>>>> During query compilation Hive needs to instantiate the UDF class and so the JAR needs to be resolvable by the class loader, thus the JAR is copied locally to a temp location for use.
>>>>>> During map/reduce jobs the local jar (like all jars added with the ADD JAR command) should then be added to the distributed cache. It looks like this is where the issue is occurring, but based on the path in the error message I suspect that either Hive or Hadoop is mistaking what should be a local path for an HDFS path.
>>>>>> 
>>>>>> On Jan 4, 2015, at 10:23 AM, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:
>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> A question: Why does it need to copy the jar file to the temp folder? Why couldn’t it use the file defined in using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar' directly? 
>>>>>>> 
>>>>>>> Regards
>>>>>>> Arthur
>>>>>>> 
>>>>>>> 
>>>>>>> On 4 Jan, 2015, at 7:48 am, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:
>>>>>>> 
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> 
>>>>>>>> A1: Are all of these commands (Step 1-5) from the same Hive CLI prompt?
>>>>>>>> Yes
>>>>>>>> 
>>>>>>>> A2:  Would you be able to check if such a file exists with the same path, on the local file system?
>>>>>>>> The file does not exist on the local file system.  
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Is there a way to set another "tmp" folder for Hive, or any suggestions to fix this issue?
>>>>>>>> 
>>>>>>>> Thanks !!
>>>>>>>> 
>>>>>>>> Arthur
>>>>>>>>  
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On 3 Jan, 2015, at 4:12 am, Jason Dere <jd...@hortonworks.com> wrote:
>>>>>>>> 
>>>>>>>>> The point of USING JAR as part of the CREATE FUNCTION statement is to try to avoid having to do ADD JAR/aux path stuff to get the UDF to work. 
>>>>>>>>> 
>>>>>>>>> Are all of these commands (Step 1-5) from the same Hive CLI prompt?
>>>>>>>>> 
>>>>>>>>>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>>>>>>>>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>>>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>>>>>>>>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>>>>> OK
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> One note, /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar here should actually be on the local file system, not on HDFS where you were checking in Step 5. During CREATE FUNCTION/query compilation, Hive will make a copy of the source JAR (hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar) in a temp location on the local file system, where it's used by that Hive session.
>>>>>>>>> 
>>>>>>>>> The location mentioned in the FileNotFoundException (hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar) has a different path than the local copy mentioned during CREATE FUNCTION (/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar). I'm not really sure why it is a HDFS path here either, but I'm not too familiar with what goes on during the job submission process. But the fact that this HDFS path has the same naming convention as the directory used for downloading resources locally (***_resources) looks a little fishy to me. Would you be able to check if such a file exists with the same path, on the local file system?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Dec 31, 2014, at 5:22 AM, Nirmal Kumar <ni...@impetus.co.in> wrote:
>>>>>>>>> 
>>>>>>>>>>   Important: HiveQL's ADD JAR operation does not work with HiveServer2 and the Beeline client when Beeline runs on a different host. As an alternative to ADD JAR, Hive auxiliary path functionality should be used as described below.
>>>>>>>>>> 
>>>>>>>>>> Refer:
>>>>>>>>>> http://www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v4-8-0/Cloudera-Manager-Managing-Clusters/cmmc_hive_udf.html
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> -Nirmal
>>>>>>>>>> 
>>>>>>>>>> From: Arthur.hk.chan@gmail.com <ar...@gmail.com>
>>>>>>>>>> Sent: Tuesday, December 30, 2014 9:54 PM
>>>>>>>>>> To: vic0777
>>>>>>>>>> Cc: Arthur.hk.chan@gmail.com; user@hive.apache.org
>>>>>>>>>> Subject: Re: CREATE FUNCTION: How to automatically load extra jar file?
>>>>>>>>>>  
>>>>>>>>>> Thank you.
>>>>>>>>>> 
>>>>>>>>>> Will this work for HiveServer2?
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Arthur
>>>>>>>>>> 
>>>>>>>>>> On 30 Dec, 2014, at 2:24 pm, vic0777 <vi...@163.com> wrote:
>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> You can put it into $HOME/.hiverc like this: ADD JAR full_path_of_the_jar. Then, the file is automatically loaded when Hive is started.
>>>>>>>>>>> 
>>>>>>>>>>> Wantao
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> At 2014-12-30 11:01:06, "Arthur.hk.chan@gmail.com" <ar...@gmail.com> wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>> 
>>>>>>>>>>> I am using Hive 0.13.1 on Hadoop 2.4.1, I need to automatically load an extra JAR file to hive for UDF, below are my steps to create the UDF function. I have tried the following but still no luck to get thru.
>>>>>>>>>>> 
>>>>>>>>>>> Please help!!
>>>>>>>>>>> 
>>>>>>>>>>> Regards
>>>>>>>>>>> Arthur
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Step 1:   (make sure the jar is in HDFS)
>>>>>>>>>>> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>>>>>>>>>> -rw-r--r--   3 hadoop hadoop      57388 2014-12-30 10:02 hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>>>>> 
>>>>>>>>>>> Step 2: (drop if function exists) 
>>>>>>>>>>> hive> drop function sysdate;                                                  
>>>>>>>>>>> OK
>>>>>>>>>>> Time taken: 0.013 seconds
>>>>>>>>>>> 
>>>>>>>>>>> Step 3: (create function using the jar in HDFS)
>>>>>>>>>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>>>>>>>>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>>>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>>>>>>>>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>>>>> OK
>>>>>>>>>>> Time taken: 0.034 seconds
>>>>>>>>>>> 
>>>>>>>>>>> Step 4: (test)
>>>>>>>>>>> hive> select sysdate();                                                                                                                                
>>>>>>>>>>> Automatically selecting local only mode for query
>>>>>>>>>>> Total jobs = 1
>>>>>>>>>>> Launching Job 1 out of 1
>>>>>>>>>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>>>>>>>>>> SLF4J: Class path contains multiple SLF4J bindings.
>>>>>>>>>>> SLF4J: Found binding in [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>>>>>>> SLF4J: Found binding in [jar:file:/hadoop/hbase-0.98.5-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>>>>>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>>>>>>>>>>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>>>>>>>>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>>>>>>>>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: yarn.nodemanager.loacl-dirs;  Ignoring.
>>>>>>>>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
>>>>>>>>>>> Execution log at: /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
>>>>>>>>>>> java.io.FileNotFoundException: File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
>>>>>>>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
>>>>>>>>>>> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>>>>>>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
>>>>>>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
>>>>>>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
>>>>>>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
>>>>>>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
>>>>>>>>>>> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
>>>>>>>>>>> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
>>>>>>>>>>> at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
>>>>>>>>>>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
>>>>>>>>>>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
>>>>>>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>>>>>>>>>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
>>>>>>>>>>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>>>>>>>>>>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>>>>>>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>>>>>>>>>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>>>>>>>>>>> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>>>>>>>>>>> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
>>>>>>>>>>> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
>>>>>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>>>>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>>>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>>>>>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>>>>>>>>>>> Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar)'
>>>>>>>>>>> Execution failed with exit status: 1
>>>>>>>>>>> Obtaining error information
>>>>>>>>>>> Task failed!
>>>>>>>>>>> Task ID:
>>>>>>>>>>>   Stage-1
>>>>>>>>>>> Logs:
>>>>>>>>>>> /tmp/hadoop/hive.log
>>>>>>>>>>> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Step 5: (check the file)
>>>>>>>>>>> hive> dfs -ls /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>>>>>>>>>> ls: `/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar': No such file or directory
>>>>>>>>>>> Command failed with exit code = 1
>>>>>>>>>>> Query returned non-zero code: 1, cause: null
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>> 
> 



Re: CREATE FUNCTION: How to automatically load extra jar file?

Posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com>.
Hi,
 
I have deleted the original Hive metastore database from MySQL and re-created a new one with character set latin1.
I have also put the jar file into HDFS under a shorter file name. The 'max key length is 767 bytes' issue from MySQL is now resolved.

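For anyone hitting the same 767-byte error, this is roughly what I ran to rebuild the metastore; the database name "metastore" and the schema-script path are assumptions on my part, so adjust them to your install:

mysql> DROP DATABASE metastore;
mysql> CREATE DATABASE metastore DEFAULT CHARACTER SET latin1;
mysql> USE metastore;
mysql> SOURCE /usr/local/hive/scripts/metastore/upgrade/mysql/hive-schema-0.13.0.mysql.sql;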

Tried again:
1) drop function sysdate;
2) CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://mycluster/hadoop/nexr.jar';

3) (check the hive log)
2015-01-15 15:05:43,133 INFO  [main]: ql.Driver (Driver.java:getSchema(238)) - Returning Hive schema: Schema(fieldSchemas:null, properties:null)
2015-01-15 15:05:43,133 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=compile start=1421305543126 end=1421305543133 duration=7 from=org.apache.hadoop.hive.ql.Driver>
2015-01-15 15:05:43,133 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
2015-01-15 15:05:43,133 INFO  [main]: ql.Driver (Driver.java:execute(1192)) - Starting command: CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://mycluster/hadoop/nexr.jar'
2015-01-15 15:05:43,134 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=TimeToSubmit start=1421305543125 end=1421305543134 duration=9 from=org.apache.hadoop.hive.ql.Driver>
2015-01-15 15:05:43,134 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver>
2015-01-15 15:05:43,134 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=task.FUNCTION.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
2015-01-15 15:05:43,135 INFO  [main]: SessionState (SessionState.java:printInfo(536)) - converting to local  hdfs://mycluster/hadoop/nexr.jar
2015-01-15 15:05:43,142 INFO  [main]: SessionState (SessionState.java:printInfo(536)) - Added /tmp/606e6a26-775f-40c8-be18-a670deee2f7e_resources/nexr.jar to class path
2015-01-15 15:05:43,142 INFO  [main]: SessionState (SessionState.java:printInfo(536)) - Added resource: /tmp/606e6a26-775f-40c8-be18-a670deee2f7e_resources/nexr.jar

4) select sysdate();                                                                                                    
converting to local hdfs://mycluster/hadoop/nexr.jar
Added /tmp/abce45b1-6041-40b6-83ed-8c6491216360_resources/nexr.jar to class path
Added resource: /tmp/abce45b1-6041-40b6-83ed-8c6491216360_resources/nexr.jar
Automatically selecting local only mode for query
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
java.io.FileNotFoundException: File does not exist: hdfs://mycluster/tmp/abce45b1-6041-40b6-83ed-8c6491216360_resources/nexr.jar
	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
	at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
	at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: hdfs://mycluster/tmp/abce45b1-6041-40b6-83ed-8c6491216360_resources/nexr.jar)'
Execution failed with exit status: 1
Obtaining error information
Task failed!

5) I cannot find "hdfs://mycluster/tmp/abce45b1-6041-40b6-83ed-8c6491216360_resources/nexr.jar" on HDFS, and the jar is not in the local /tmp/ folder either.
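
These are roughly the checks I used; the paths are taken from the error message above:

hive> dfs -ls hdfs://mycluster/tmp/abce45b1-6041-40b6-83ed-8c6491216360_resources/nexr.jar;
$ ls -l /tmp/abce45b1-6041-40b6-83ed-8c6491216360_resources/nexr.jar

Both report "No such file or directory".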


I think this is the case Jason described earlier: the libjars setting for the map/reduce job is somehow getting sent without the "file:///" prefix, which causes Hadoop to interpret the path as an HDFS path rather than a local path.

Is there a way to verify the libjars setting for the map/reduce job?
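
For example, would something like this show it? The log path below is the one shown under "Logs:" in the earlier failure output; the console-logger override is only my guess at the easiest way to surface the job configuration:

$ grep "adding libjars" /tmp/hadoop/hive.log
$ hive --hiveconf hive.root.logger=INFO,console -e "select sysdate();"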

Please help!
Regards
Arthur




On 11 Jan, 2015, at 5:35 pm, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:

> Hi,
> 
> 
> 
> mysql> show variables like "character_set_database";
> +------------------------+--------+
> | Variable_name          | Value  |
> +------------------------+--------+
> | character_set_database | latin1 |
> +------------------------+--------+
> 1 row in set (0.00 sec)
> 
> mysql> show variables like "collation_database";
> +--------------------+-------------------+
> | Variable_name      | Value             |
> +--------------------+-------------------+
> | collation_database | latin1_swedish_ci |
> +--------------------+-------------------+
> 1 row in set (0.00 sec)
> 
> 
> 
> 2015-01-11 17:21:07,835 ERROR [main]: DataNucleus.Datastore (Log4JLogger.java:error(115)) - An exception was thrown while adding/validating class(es) : Specified key was too long; max key length is 767 bytes
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> 	at com.mysql.jdbc.Util.handleNewInstance(Util.java:408)
> 	at com.mysql.jdbc.Util.getInstance(Util.java:383)
> 	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1062)
> 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4226)
> 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4158)
> 	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2615)
> 	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2776)
> 	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2834)
> 	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2783)
> 	at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:908)
> 	at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:788)
> 	at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:254)
> 	at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:760)
> 	at org.datanucleus.store.rdbms.table.TableImpl.createIndices(TableImpl.java:648)
> 	at org.datanucleus.store.rdbms.table.TableImpl.validateIndices(TableImpl.java:593)
> 	at org.datanucleus.store.rdbms.table.TableImpl.validateConstraints(TableImpl.java:390)
> 	at org.datanucleus.store.rdbms.table.ClassTable.validateConstraints(ClassTable.java:3463)
> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3464)
> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.addClassTablesAndValidate(RDBMSStoreManager.java:3190)
> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2841)
> 	at org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:122)
> 	at org.datanucleus.store.rdbms.RDBMSStoreManager.addClasses(RDBMSStoreManager.java:1605)
> 	at org.datanucleus.store.AbstractStoreManager.addClass(AbstractStoreManager.java:954)
> 	at org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:679)
> 	at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getStatementForCandidates(RDBMSQueryUtils.java:408)
> 	at org.datanucleus.store.rdbms.query.JDOQLQuery.compileQueryFull(JDOQLQuery.java:947)
> 	at org.datanucleus.store.rdbms.query.JDOQLQuery.compileInternal(JDOQLQuery.java:370)
> 	at org.datanucleus.store.query.Query.executeQuery(Query.java:1744)
> 	at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672)
> 	at org.datanucleus.store.query.Query.execute(Query.java:1654)
> 	at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:221)
> 	at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:121)
> 	at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:252)
> 	at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:223)
> 	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
> 	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> 	at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:58)
> 	at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:67)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:497)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:475)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:523)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:397)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:356)
> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:54)
> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4944)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:171)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> 	at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410)
> 	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
> 	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
> 	at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340)
> 	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> 	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> 
> 2015-01-11 17:21:07,835 ERROR [main]: DataNucleus.Datastore (Log4JLogger.java:error(115)) - An exception was thrown while adding/validating class(es) : Specified key was too long; max key length is 767 bytes
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> 	at com.mysql.jdbc.Util.handleNewInstance(Util.java:408)
> 	at com.mysql.jdbc.Util.getInstance(Util.java:383)
> 	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1062)
> 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4226)
> 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4158)
> 	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2615)
> 	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2776)
> 	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2834)
> 	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2783)
> 	at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:908)
> 	at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:788)
> 	at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:254)
> 	at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:760)
> 	at org.datanucleus.store.rdbms.table.TableImpl.createIndices(TableImpl.java:648)
> 	at org.datanucleus.store.rdbms.table.TableImpl.validateIndices(TableImpl.java:593)
> 	at org.datanucleus.store.rdbms.table.TableImpl.validateConstraints(TableImpl.java:390)
> 	at org.datanucleus.store.rdbms.table.ClassTable.validateConstraints(ClassTable.java:3463)
> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3464)
> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.addClassTablesAndValidate(RDBMSStoreManager.java:3190)
> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2841)
> 	at org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:122)
> 	at org.datanucleus.store.rdbms.RDBMSStoreManager.addClasses(RDBMSStoreManager.java:1605)
> 	at org.datanucleus.store.AbstractStoreManager.addClass(AbstractStoreManager.java:954)
> 	at org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:679)
> 	at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getStatementForCandidates(RDBMSQueryUtils.java:408)
> 	at org.datanucleus.store.rdbms.query.JDOQLQuery.compileQueryFull(JDOQLQuery.java:947)
> 	at org.datanucleus.store.rdbms.query.JDOQLQuery.compileInternal(JDOQLQuery.java:370)
> 	at org.datanucleus.store.query.Query.executeQuery(Query.java:1744)
> 	at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672)
> 	at org.datanucleus.store.query.Query.execute(Query.java:1654)
> 	at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:221)
> 	at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:121)
> 	at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:252)
> 	at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:223)
> 	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
> 	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> 	at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:58)
> 	at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:67)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:497)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:475)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:523)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:397)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:356)
> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:54)
> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4944)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:171)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> 	at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410)
> 	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
> 	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
> 	at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340)
> 	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> 	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> 
> 
> What could be wrong?
> Regards
> Arthur
> 
> 
> On 11 Jan, 2015, at 5:18 pm, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:
> 
>> Hi,
>> 
>> 
>> 2015-01-04 08:57:12,154 ERROR [main]: DataNucleus.Datastore (Log4JLogger.java:error(115)) - An exception was thrown while adding/validating class(es) : Specified key was too long; max key length is 767 bytes
>> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes
>> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>> 	at com.mysql.jdbc.Util.handleNewInstance(Util.java:408)
>> 	at com.mysql.jdbc.Util.getInstance(Util.java:383)
>> 	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1062)
>> 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4226)
>> 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4158)
>> 	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2615)
>> 	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2776)
>> 	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2834)
>> 	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2783)
>> 	at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:908)
>> 	at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:788)
>> 	at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:254)
>> 	at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:760)
>> 	at org.datanucleus.store.rdbms.table.TableImpl.createIndices(TableImpl.java:648)
>> 	at org.datanucleus.store.rdbms.table.TableImpl.validateIndices(TableImpl.java:593)
>> 	at org.datanucleus.store.rdbms.table.TableImpl.validateConstraints(TableImpl.java:390)
>> 	at org.datanucleus.store.rdbms.table.ClassTable.validateConstraints(ClassTable.java:3463)
>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3464)
>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.addClassTablesAndValidate(RDBMSStoreManager.java:3190)
>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2841)
>> 	at org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:122)
>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager.addClasses(RDBMSStoreManager.java:1605)
>> 	at org.datanucleus.store.AbstractStoreManager.addClass(AbstractStoreManager.java:954)
>> 	at org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:679)
>> 	at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getStatementForCandidates(RDBMSQueryUtils.java:408)
>> 	at org.datanucleus.store.rdbms.query.JDOQLQuery.compileQueryFull(JDOQLQuery.java:947)
>> 	at org.datanucleus.store.rdbms.query.JDOQLQuery.compileInternal(JDOQLQuery.java:370)
>> 	at org.datanucleus.store.query.Query.executeQuery(Query.java:1744)
>> 	at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672)
>> 	at org.datanucleus.store.query.Query.execute(Query.java:1654)
>> 	at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:221)
>> 	at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:121)
>> 	at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:252)
>> 	at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:223)
>> 	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
>> 	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>> 	at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:58)
>> 	at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:67)
>> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:497)
>> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:475)
>> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_all_databases(HiveMetaStore.java:1026)
>> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> 	at java.lang.reflect.Method.invoke(Method.java:606)
>> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
>> 	at com.sun.proxy.$Proxy10.get_all_databases(Unknown Source)
>> 	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllDatabases(HiveMetaStoreClient.java:837)
>> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> 	at java.lang.reflect.Method.invoke(Method.java:606)
>> 	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
>> 	at com.sun.proxy.$Proxy11.getAllDatabases(Unknown Source)
>> 	at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1098)
>> 	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionNames(FunctionRegistry.java:671)
>> 	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionNames(FunctionRegistry.java:662)
>> 	at org.apache.hadoop.hive.cli.CliDriver.getCommandCompletor(CliDriver.java:540)
>> 	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:758)
>> 	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
>> 	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
>> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> 	at java.lang.reflect.Method.invoke(Method.java:606)
>> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>> 
>> Regards
>> Arthur
>> 
>> 
>> 
>> On 7 Jan, 2015, at 7:22 am, Jason Dere <jd...@hortonworks.com> wrote:
>> 
>>> Does your hive.log contain any lines with "adding libjars:"?
>>> 
>>> You may also want to search for any lines containing "_resources"; I would like to see the results of both searches.
>>> 
>>> For example, mine is showing the following line:
>>> 2015-01-06 14:53:28,115 INFO  mr.ExecDriver (ExecDriver.java:execute(307)) - adding libjars: file:///tmp/d0ed1585-d9e6-4944-b985-225351574de0_resources/spatial-sdk-hive-1.0.3-SNAPSHOT.jar,file:///tmp/d0ed1585-d9e6-4944-b985-225351574de0_resources/esri-geometry-api.jar
>>> 
>>> I wonder if your libjars setting for the map/reduce job is somehow getting sent without the "file:///", which might be causing hadoop to interpret the path as a HDFS path rather than a local path.
>>> 
>>> On Jan 6, 2015, at 1:11 AM, Arthur.hk.chan <ar...@gmail.com> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> my hadoop’s core-site.xml contains the following about tmp
>>>> 
>>>> <property>
>>>>   <name>hadoop.tmp.dir</name>
>>>>   <value>/hadoop_data/hadoop_data/tmp</value>
>>>> </property>
>>>> 
>>>> 
>>>> 
>>>> my hive-default.xml contains the following about tmp
>>>> 
>>>> <property>
>>>>   <name>hive.exec.scratchdir</name>
>>>>   <value>/tmp/hive-${user.name}</value>
>>>>   <description>Scratch space for Hive jobs</description>
>>>> </property>
>>>> 
>>>> <property>
>>>>   <name>hive.exec.local.scratchdir</name>
>>>>   <value>/tmp/${user.name}</value>
>>>>   <description>Local scratch space for Hive jobs</description>
>>>> </property>
>>>> 
>>>> 
>>>> 
>>>> Could this be related to a configuration issue, or is it a bug?
>>>> 
>>>> Please help!
>>>> 
>>>> Regards
>>>> Arthur
>>>> 
>>>> 
>>>> On 6 Jan, 2015, at 3:45 am, Jason Dere <jd...@hortonworks.com> wrote:
>>>> 
>>>>> During query compilation Hive needs to instantiate the UDF class and so the JAR needs to be resolvable by the class loader, thus the JAR is copied locally to a temp location for use.
>>>>> During map/reduce jobs the local jar (like all jars added with the ADD JAR command) should then be added to the distributed cache. It looks like this is where the issue is occurring, but based on the path in the error message I suspect that either Hive or Hadoop is mistaking what should be a local path for an HDFS path.
>>>>> 
>>>>> On Jan 4, 2015, at 10:23 AM, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> A question: Why does it need to copy the jar file to the temp folder? Why couldn't it use the file defined in USING JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar' directly? 
>>>>>> 
>>>>>> Regards
>>>>>> Arthur
>>>>>> 
>>>>>> 
>>>>>> On 4 Jan, 2015, at 7:48 am, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:
>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> 
>>>>>>> A1: Are all of these commands (Step 1-5) from the same Hive CLI prompt?
>>>>>>> Yes
>>>>>>> 
>>>>>>> A2:  Would you be able to check if such a file exists with the same path, on the local file system?
>>>>>>> The file does not exist on the local file system.  
>>>>>>> 
>>>>>>> 
>>>>>>> Is there a way to set another "tmp" folder for Hive? Or are there any suggestions to fix this issue?
>>>>>>> 
>>>>>>> Thanks !!
>>>>>>> 
>>>>>>> Arthur
>>>>>>>  
>>>>>>> 
>>>>>>> 
>>>>>>> On 3 Jan, 2015, at 4:12 am, Jason Dere <jd...@hortonworks.com> wrote:
>>>>>>> 
>>>>>>>> The point of USING JAR as part of the CREATE FUNCTION statement is to avoid having to do ADD JAR/aux path stuff to get the UDF to work. 
>>>>>>>> 
>>>>>>>> Are all of these commands (Step 1-5) from the same Hive CLI prompt?
>>>>>>>> 
>>>>>>>>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>>>>>>>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>>>>>>>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>>>> OK
>>>>>>>> 
>>>>>>>> 
>>>>>>>> One note: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar here should actually be on the local file system, not on HDFS where you were checking in Step 5. During CREATE FUNCTION/query compilation, Hive will make a copy of the source JAR (hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar) in a temp location on the local file system, where it's used by that Hive session.
>>>>>>>> 
>>>>>>>> The location mentioned in the FileNotFoundException (hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar) has a different path than the local copy mentioned during CREATE FUNCTION (/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar). I'm not really sure why it is an HDFS path here either, but I'm not too familiar with what goes on during the job submission process. The fact that this HDFS path has the same naming convention as the directory used for downloading resources locally (***_resources) looks a little fishy to me, though. Would you be able to check if such a file exists with the same path, on the local file system?
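>>>>>>>> 
>>>>>>>> For example, something like this on the machine running the Hive CLI (the directory name is taken from the error message above):
>>>>>>>> 
>>>>>>>> ls -l /tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/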
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Dec 31, 2014, at 5:22 AM, Nirmal Kumar <ni...@impetus.co.in> wrote:
>>>>>>>> 
>>>>>>>>>   Important: HiveQL's ADD JAR operation does not work with HiveServer2 and the Beeline client when Beeline runs on a different host. As an alternative to ADD JAR, Hive auxiliary path functionality should be used as described below.
>>>>>>>>> 
>>>>>>>>> Refer:
>>>>>>>>> http://www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v4-8-0/Cloudera-Manager-Managing-Clusters/cmmc_hive_udf.html
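>>>>>>>>> 
>>>>>>>>> A minimal sketch of the aux path approach, assuming the jar has been copied to a local directory on the HiveServer2 host (the path below is a placeholder):
>>>>>>>>> 
>>>>>>>>> export HIVE_AUX_JARS_PATH=/opt/hive/auxlib/nexr-hive-udf-0.2-SNAPSHOT.jar   # placeholder path
>>>>>>>>> hive --service hiveserver2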
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> -Nirmal
>>>>>>>>> 
>>>>>>>>> From: Arthur.hk.chan@gmail.com <ar...@gmail.com>
>>>>>>>>> Sent: Tuesday, December 30, 2014 9:54 PM
>>>>>>>>> To: vic0777
>>>>>>>>> Cc: Arthur.hk.chan@gmail.com; user@hive.apache.org
>>>>>>>>> Subject: Re: CREATE FUNCTION: How to automatically load extra jar file?
>>>>>>>>>  
>>>>>>>>> Thank you.
>>>>>>>>> 
>>>>>>>>> Will this work for HiveServer2?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Arthur
>>>>>>>>> 
>>>>>>>>> On 30 Dec, 2014, at 2:24 pm, vic0777 <vi...@163.com> wrote:
>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> You can put it into $HOME/.hiverc like this: ADD JAR full_path_of_the_jar. Then, the file is automatically loaded when Hive is started.
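>>>>>>>>>> 
>>>>>>>>>> For example, a one-liner to set that up (the jar path is a placeholder):
>>>>>>>>>> 
>>>>>>>>>> echo 'ADD JAR /opt/hive/lib/nexr-hive-udf-0.2-SNAPSHOT.jar;' >> $HOME/.hiverc   # placeholder path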
>>>>>>>>>> 
>>>>>>>>>> Wantao
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> At 2014-12-30 11:01:06, "Arthur.hk.chan@gmail.com" <ar...@gmail.com> wrote:
>>>>>>>>>> Hi,
>>>>>>>>>> 
>>>>>>>>>> I am using Hive 0.13.1 on Hadoop 2.4.1. I need to automatically load an extra JAR file into Hive for a UDF; below are my steps to create the function. I have tried the following but still have no luck getting through.
>>>>>>>>>> 
>>>>>>>>>> Please help!!
>>>>>>>>>> 
>>>>>>>>>> Regards
>>>>>>>>>> Arthur
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Step 1: (make sure the jar is in HDFS)
>>>>>>>>>> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>>>>>>>>> -rw-r--r--   3 hadoop hadoop      57388 2014-12-30 10:02 hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>>>> 
>>>>>>>>>> Step 2: (drop the function if it exists) 
>>>>>>>>>> hive> drop function sysdate;                                                  
>>>>>>>>>> OK
>>>>>>>>>> Time taken: 0.013 seconds
>>>>>>>>>> 
>>>>>>>>>> Step 3: (create function using the jar in HDFS)
>>>>>>>>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>>>>>>>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>>>>>>>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>>>> OK
>>>>>>>>>> Time taken: 0.034 seconds
>>>>>>>>>> 
>>>>>>>>>> Step 4: (test)
>>>>>>>>>> hive> select sysdate();                                                                                                                                
>>>>>>>>>> Automatically selecting local only mode for query
>>>>>>>>>> Total jobs = 1
>>>>>>>>>> Launching Job 1 out of 1
>>>>>>>>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>>>>>>>>> SLF4J: Class path contains multiple SLF4J bindings.
>>>>>>>>>> SLF4J: Found binding in [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>>>>>> SLF4J: Found binding in [jar:file:/hadoop/hbase-0.98.5-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>>>>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>>>>>>>>>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>>>>>>>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>>>>>>>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: yarn.nodemanager.loacl-dirs;  Ignoring.
>>>>>>>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
>>>>>>>>>> Execution log at: /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
>>>>>>>>>> java.io.FileNotFoundException: File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
>>>>>>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
>>>>>>>>>> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>>>>>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
>>>>>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
>>>>>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
>>>>>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
>>>>>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
>>>>>>>>>> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
>>>>>>>>>> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
>>>>>>>>>> at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
>>>>>>>>>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
>>>>>>>>>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
>>>>>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>>>>>>>>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
>>>>>>>>>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>>>>>>>>>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>>>>>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>>>>>>>>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>>>>>>>>>> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>>>>>>>>>> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
>>>>>>>>>> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
>>>>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>>>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>>>>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>>>>>>>>>> Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar)'
>>>>>>>>>> Execution failed with exit status: 1
>>>>>>>>>> Obtaining error information
>>>>>>>>>> Task failed!
>>>>>>>>>> Task ID:
>>>>>>>>>>   Stage-1
>>>>>>>>>> Logs:
>>>>>>>>>> /tmp/hadoop/hive.log
>>>>>>>>>> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Step 5: (check the file)
>>>>>>>>>> hive> dfs -ls /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>>>>>>>>> ls: `/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar': No such file or directory
>>>>>>>>>> Command failed with exit code = 1
>>>>>>>>>> Query returned non-zero code: 1, cause: null
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>> 
> 


Re: CREATE FUNCTION: How to automatically load extra jar file?

Posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com>.
Hi,

Here are my MySQL metastore database settings; the error below keeps appearing in hive.log:

mysql> show variables like "character_set_database";
+------------------------+--------+
| Variable_name          | Value  |
+------------------------+--------+
| character_set_database | latin1 |
+------------------------+--------+
1 row in set (0.00 sec)

mysql> show variables like "collation_database";
+--------------------+-------------------+
| Variable_name      | Value             |
+--------------------+-------------------+
| collation_database | latin1_swedish_ci |
+--------------------+-------------------+
1 row in set (0.00 sec)
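
To rule out tables created while a different default character set (e.g. utf8) was in effect (existing tables keep the character set they were created with), the per-table collations can be checked as well. The schema name 'metastore' below is an assumption; substitute the actual metastore database name:

mysql> SELECT table_name, table_collation FROM information_schema.tables WHERE table_schema = 'metastore'; -- 'metastore' is assumed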



2015-01-11 17:21:07,835 ERROR [main]: DataNucleus.Datastore (Log4JLogger.java:error(115)) - An exception was thrown while adding/validating class(es) : Specified key was too long; max key length is 767 bytes
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at com.mysql.jdbc.Util.handleNewInstance(Util.java:408)
	at com.mysql.jdbc.Util.getInstance(Util.java:383)
	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1062)
	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4226)
	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4158)
	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2615)
	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2776)
	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2834)
	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2783)
	at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:908)
	at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:788)
	at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:254)
	at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:760)
	at org.datanucleus.store.rdbms.table.TableImpl.createIndices(TableImpl.java:648)
	at org.datanucleus.store.rdbms.table.TableImpl.validateIndices(TableImpl.java:593)
	at org.datanucleus.store.rdbms.table.TableImpl.validateConstraints(TableImpl.java:390)
	at org.datanucleus.store.rdbms.table.ClassTable.validateConstraints(ClassTable.java:3463)
	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3464)
	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.addClassTablesAndValidate(RDBMSStoreManager.java:3190)
	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2841)
	at org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:122)
	at org.datanucleus.store.rdbms.RDBMSStoreManager.addClasses(RDBMSStoreManager.java:1605)
	at org.datanucleus.store.AbstractStoreManager.addClass(AbstractStoreManager.java:954)
	at org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:679)
	at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getStatementForCandidates(RDBMSQueryUtils.java:408)
	at org.datanucleus.store.rdbms.query.JDOQLQuery.compileQueryFull(JDOQLQuery.java:947)
	at org.datanucleus.store.rdbms.query.JDOQLQuery.compileInternal(JDOQLQuery.java:370)
	at org.datanucleus.store.query.Query.executeQuery(Query.java:1744)
	at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672)
	at org.datanucleus.store.query.Query.execute(Query.java:1654)
	at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:221)
	at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:121)
	at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:252)
	at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:223)
	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
	at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:58)
	at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:67)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:497)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:475)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:523)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:397)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:356)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:54)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59)
	at org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4944)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:171)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
	at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
	at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
	at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

(The same ERROR and stack trace are logged a second time at 2015-01-11 17:21:07,835; the duplicate is identical to the one above.)


What could be wrong?
Regards
Arthur


On 11 Jan, 2015, at 5:18 pm, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:

> Hi,
> 
> 
> 2015-01-04 08:57:12,154 ERROR [main]: DataNucleus.Datastore (Log4JLogger.java:error(115)) - An exception was thrown while adding/validating class(es) : Specified key was too long; max key length is 767 bytes
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> 	at com.mysql.jdbc.Util.handleNewInstance(Util.java:408)
> 	at com.mysql.jdbc.Util.getInstance(Util.java:383)
> 	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1062)
> 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4226)
> 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4158)
> 	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2615)
> 	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2776)
> 	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2834)
> 	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2783)
> 	at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:908)
> 	at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:788)
> 	at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:254)
> 	at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:760)
> 	at org.datanucleus.store.rdbms.table.TableImpl.createIndices(TableImpl.java:648)
> 	at org.datanucleus.store.rdbms.table.TableImpl.validateIndices(TableImpl.java:593)
> 	at org.datanucleus.store.rdbms.table.TableImpl.validateConstraints(TableImpl.java:390)
> 	at org.datanucleus.store.rdbms.table.ClassTable.validateConstraints(ClassTable.java:3463)
> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3464)
> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.addClassTablesAndValidate(RDBMSStoreManager.java:3190)
> 	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2841)
> 	at org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:122)
> 	at org.datanucleus.store.rdbms.RDBMSStoreManager.addClasses(RDBMSStoreManager.java:1605)
> 	at org.datanucleus.store.AbstractStoreManager.addClass(AbstractStoreManager.java:954)
> 	at org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:679)
> 	at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getStatementForCandidates(RDBMSQueryUtils.java:408)
> 	at org.datanucleus.store.rdbms.query.JDOQLQuery.compileQueryFull(JDOQLQuery.java:947)
> 	at org.datanucleus.store.rdbms.query.JDOQLQuery.compileInternal(JDOQLQuery.java:370)
> 	at org.datanucleus.store.query.Query.executeQuery(Query.java:1744)
> 	at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672)
> 	at org.datanucleus.store.query.Query.execute(Query.java:1654)
> 	at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:221)
> 	at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:121)
> 	at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:252)
> 	at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:223)
> 	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
> 	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> 	at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:58)
> 	at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:67)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:497)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:475)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_all_databases(HiveMetaStore.java:1026)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
> 	at com.sun.proxy.$Proxy10.get_all_databases(Unknown Source)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllDatabases(HiveMetaStoreClient.java:837)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
> 	at com.sun.proxy.$Proxy11.getAllDatabases(Unknown Source)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1098)
> 	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionNames(FunctionRegistry.java:671)
> 	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionNames(FunctionRegistry.java:662)
> 	at org.apache.hadoop.hive.cli.CliDriver.getCommandCompletor(CliDriver.java:540)
> 	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:758)
> 	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
> 	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> 
> Regards
> Arthur
> 
> 
> 
> On 7 Jan, 2015, at 7:22 am, Jason Dere <jd...@hortonworks.com> wrote:
> 
>> Does your hive.log contain any lines with "adding libjars:"?
>> 
>> You may also search for any lines containing "_resources"; I would like to see the results of both searches.
>> 
>> For example, mine is showing the following line:
>> 2015-01-06 14:53:28,115 INFO  mr.ExecDriver (ExecDriver.java:execute(307)) - adding libjars: file:///tmp/d0ed1585-d9e6-4944-b985-225351574de0_resources/spatial-sdk-hive-1.0.3-SNAPSHOT.jar,file:///tmp/d0ed1585-d9e6-4944-b985-225351574de0_resources/esri-geometry-api.jar
>> 
>> I wonder if your libjars setting for the map/reduce job is somehow getting sent without the "file:///", which might be causing Hadoop to interpret the path as an HDFS path rather than a local path.
>> 
>> On Jan 6, 2015, at 1:11 AM, Arthur.hk.chan <ar...@gmail.com> wrote:
>> 
>>> Hi,
>>> 
>>> my Hadoop core-site.xml contains the following about tmp:
>>> 
>>> <property>
>>>   <name>hadoop.tmp.dir</name>
>>>   <value>/hadoop_data/hadoop_data/tmp</value>
>>> </property>
>>> 
>>> 
>>> 
>>> my hive-default.xml contains the following about tmp:
>>> 
>>> <property>
>>>   <name>hive.exec.scratchdir</name>
>>>   <value>/tmp/hive-${user.name}</value>
>>>   <description>Scratch space for Hive jobs</description>
>>> </property>
>>> 
>>> <property>
>>>   <name>hive.exec.local.scratchdir</name>
>>>   <value>/tmp/${user.name}</value>
>>>   <description>Local scratch space for Hive jobs</description>
>>> </property>
>>> 
>>> 
>>> 
>>> Is this related to a configuration issue, or is it a bug?
>>> 
>>> Please help!
>>> 
>>> Regards
>>> Arthur
>>> 
>>> 
>>> On 6 Jan, 2015, at 3:45 am, Jason Dere <jd...@hortonworks.com> wrote:
>>> 
>>>> During query compilation Hive needs to instantiate the UDF class, so the JAR needs to be resolvable by the class loader; the JAR is therefore copied to a local temp location for use.
>>>> During map/reduce jobs the local jar (like all jars added with the ADD JAR command) should then be added to the distributed cache. It looks like this is where the issue is occurring, but based on the path in the error message I suspect that either Hive or Hadoop is mistaking what should be a local path for an HDFS path.
>>>> 
>>>> On Jan 4, 2015, at 10:23 AM, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> A question: Why does it need to copy the jar file to the temp folder? Why couldn't it use the file defined in USING JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar' directly? 
>>>>> 
>>>>> Regards
>>>>> Arthur
>>>>> 
>>>>> 
>>>>> On 4 Jan, 2015, at 7:48 am, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> 
>>>>>> A1: Are all of these commands (Step 1-5) from the same Hive CLI prompt?
>>>>>> Yes
>>>>>> 
>>>>>> A2:  Would you be able to check if such a file exists with the same path, on the local file system?
>>>>>> The file does not exist on the local file system.  
>>>>>> 
>>>>>> 
>>>>>> Is there a way to set another "tmp" folder for Hive? Or are there any suggestions to fix this issue?
>>>>>> 
>>>>>> Thanks !!
>>>>>> 
>>>>>> Arthur
>>>>>>  
>>>>>> 
>>>>>> 
>>>>>> On 3 Jan, 2015, at 4:12 am, Jason Dere <jd...@hortonworks.com> wrote:
>>>>>> 
>>>>>>> The point of USING JAR as part of the CREATE FUNCTION statement is to avoid having to do ADD JAR/aux path stuff to get the UDF to work. 
>>>>>>> 
>>>>>>> Are all of these commands (Step 1-5) from the same Hive CLI prompt?
>>>>>>> 
>>>>>>>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>>>>>>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>>>>>>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>>> OK
>>>>>>> 
>>>>>>> 
>>>>>>> One note: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar here should actually be on the local file system, not on HDFS where you were checking in Step 5. During CREATE FUNCTION/query compilation, Hive will make a copy of the source JAR (hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar) in a temp location on the local file system, where it's used by that Hive session.
>>>>>>> 
>>>>>>> The location mentioned in the FileNotFoundException (hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar) has a different path than the local copy mentioned during CREATE FUNCTION (/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar). I'm not really sure why it is an HDFS path here either, but I'm not too familiar with what goes on during the job submission process. The fact that this HDFS path has the same naming convention as the directory used for downloading resources locally (***_resources) looks a little fishy to me, though. Would you be able to check if such a file exists with the same path, on the local file system?
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Dec 31, 2014, at 5:22 AM, Nirmal Kumar <ni...@impetus.co.in> wrote:
>>>>>>> 
>>>>>>>>   Important: HiveQL's ADD JAR operation does not work with HiveServer2 and the Beeline client when Beeline runs on a different host. As an alternative to ADD JAR, Hive auxiliary path functionality should be used as described below.
>>>>>>>> 
>>>>>>>> Refer:
>>>>>>>> http://www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v4-8-0/Cloudera-Manager-Managing-Clusters/cmmc_hive_udf.html
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> -Nirmal
>>>>>>>> 
>>>>>>>> From: Arthur.hk.chan@gmail.com <ar...@gmail.com>
>>>>>>>> Sent: Tuesday, December 30, 2014 9:54 PM
>>>>>>>> To: vic0777
>>>>>>>> Cc: Arthur.hk.chan@gmail.com; user@hive.apache.org
>>>>>>>> Subject: Re: CREATE FUNCTION: How to automatically load extra jar file?
>>>>>>>>  
>>>>>>>> Thank you.
>>>>>>>> 
>>>>>>>> Will this work for HiveServer2?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Arthur
>>>>>>>> 
>>>>>>>> On 30 Dec, 2014, at 2:24 pm, vic0777 <vi...@163.com> wrote:
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> You can put it into $HOME/.hiverc like this: ADD JAR full_path_of_the_jar. Then, the file is automatically loaded when Hive is started.
>>>>>>>>> 
>>>>>>>>> Wantao
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> At 2014-12-30 11:01:06, "Arthur.hk.chan@gmail.com" <ar...@gmail.com> wrote:
>>>>>>>>> Hi,
>>>>>>>>> 
>>>>>>>>> I am using Hive 0.13.1 on Hadoop 2.4.1. I need to automatically load an extra JAR file into Hive for a UDF; below are my steps to create the function. I have tried the following but still have no luck getting through.
>>>>>>>>> 
>>>>>>>>> Please help!!
>>>>>>>>> 
>>>>>>>>> Regards
>>>>>>>>> Arthur
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Step 1: (make sure the jar is in HDFS)
>>>>>>>>> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>>>>>>>> -rw-r--r--   3 hadoop hadoop      57388 2014-12-30 10:02 hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>>> 
>>>>>>>>> Step 2: (drop the function if it exists) 
>>>>>>>>> hive> drop function sysdate;                                                  
>>>>>>>>> OK
>>>>>>>>> Time taken: 0.013 seconds
>>>>>>>>> 
>>>>>>>>> Step 3: (create function using the jar in HDFS)
>>>>>>>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>>>>>>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>>>>>>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>>> OK
>>>>>>>>> Time taken: 0.034 seconds
>>>>>>>>> 
>>>>>>>>> Step 4: (test)
>>>>>>>>> hive> select sysdate();                                                                                                                               
>>>>>>>>> Automatically selecting local only mode for query
>>>>>>>>> Total jobs = 1
>>>>>>>>> Launching Job 1 out of 1
>>>>>>>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>>>>>>>> SLF4J: Class path contains multiple SLF4J bindings.
>>>>>>>>> SLF4J: Found binding in [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>>>>> SLF4J: Found binding in [jar:file:/hadoop/hbase-0.98.5-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>>>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>>>>>>>>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>>>>>>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>>>>>>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: yarn.nodemanager.loacl-dirs;  Ignoring.
>>>>>>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
>>>>>>>>> Execution log at: /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
>>>>>>>>> java.io.FileNotFoundException: File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
>>>>>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
>>>>>>>>> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>>>>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
>>>>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
>>>>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
>>>>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
>>>>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
>>>>>>>>> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
>>>>>>>>> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
>>>>>>>>> at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
>>>>>>>>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
>>>>>>>>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
>>>>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>>>>>>>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
>>>>>>>>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>>>>>>>>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>>>>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>>>>>>>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>>>>>>>>> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>>>>>>>>> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
>>>>>>>>> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
>>>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>>>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>>>>>>>>> Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar)'
>>>>>>>>> Execution failed with exit status: 1
>>>>>>>>> Obtaining error information
>>>>>>>>> Task failed!
>>>>>>>>> Task ID:
>>>>>>>>>   Stage-1
>>>>>>>>> Logs:
>>>>>>>>> /tmp/hadoop/hive.log
>>>>>>>>> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Step 5: (check the file)
>>>>>>>>> hive> dfs -ls /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>>>>>>>> ls: `/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar': No such file or directory
>>>>>>>>> Command failed with exit code = 1
>>>>>>>>> Query returned non-zero code: 1, cause: null
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>> 
>> 
> 


Re: CREATE FUNCTION: How to automatically load extra jar file?

Posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com>.
Hi,

I found the following error in hive.log:

2015-01-04 08:57:12,154 ERROR [main]: DataNucleus.Datastore (Log4JLogger.java:error(115)) - An exception was thrown while adding/validating class(es) : Specified key was too long; max key length is 767 bytes
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at com.mysql.jdbc.Util.handleNewInstance(Util.java:408)
	at com.mysql.jdbc.Util.getInstance(Util.java:383)
	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1062)
	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4226)
	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4158)
	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2615)
	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2776)
	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2834)
	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2783)
	at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:908)
	at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:788)
	at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:254)
	at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:760)
	at org.datanucleus.store.rdbms.table.TableImpl.createIndices(TableImpl.java:648)
	at org.datanucleus.store.rdbms.table.TableImpl.validateIndices(TableImpl.java:593)
	at org.datanucleus.store.rdbms.table.TableImpl.validateConstraints(TableImpl.java:390)
	at org.datanucleus.store.rdbms.table.ClassTable.validateConstraints(ClassTable.java:3463)
	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3464)
	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.addClassTablesAndValidate(RDBMSStoreManager.java:3190)
	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2841)
	at org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:122)
	at org.datanucleus.store.rdbms.RDBMSStoreManager.addClasses(RDBMSStoreManager.java:1605)
	at org.datanucleus.store.AbstractStoreManager.addClass(AbstractStoreManager.java:954)
	at org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:679)
	at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getStatementForCandidates(RDBMSQueryUtils.java:408)
	at org.datanucleus.store.rdbms.query.JDOQLQuery.compileQueryFull(JDOQLQuery.java:947)
	at org.datanucleus.store.rdbms.query.JDOQLQuery.compileInternal(JDOQLQuery.java:370)
	at org.datanucleus.store.query.Query.executeQuery(Query.java:1744)
	at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672)
	at org.datanucleus.store.query.Query.execute(Query.java:1654)
	at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:221)
	at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:121)
	at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:252)
	at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:223)
	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
	at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:58)
	at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:67)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:497)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:475)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_all_databases(HiveMetaStore.java:1026)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
	at com.sun.proxy.$Proxy10.get_all_databases(Unknown Source)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllDatabases(HiveMetaStoreClient.java:837)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
	at com.sun.proxy.$Proxy11.getAllDatabases(Unknown Source)
	at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1098)
	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionNames(FunctionRegistry.java:671)
	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionNames(FunctionRegistry.java:662)
	at org.apache.hadoop.hive.cli.CliDriver.getCommandCompletor(CliDriver.java:540)
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:758)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Regards
Arthur



On 7 Jan, 2015, at 7:22 am, Jason Dere <jd...@hortonworks.com> wrote:

> Does your hive.log contain any lines with "adding libjars:"?
> 
> You may also search for any lines containing "_resources"; I would like to see the results of both searches.
> 
> For example, mine is showing the following line:
> 2015-01-06 14:53:28,115 INFO  mr.ExecDriver (ExecDriver.java:execute(307)) - adding libjars: file:///tmp/d0ed1585-d9e6-4944-b985-225351574de0_resources/spatial-sdk-hive-1.0.3-SNAPSHOT.jar,file:///tmp/d0ed1585-d9e6-4944-b985-225351574de0_resources/esri-geometry-api.jar
> 
> I wonder if your libjars setting for the map/reduce job is somehow getting sent without the "file:///", which might be causing Hadoop to interpret the path as an HDFS path rather than a local path.
> 
> On Jan 6, 2015, at 1:11 AM, Arthur.hk.chan <ar...@gmail.com> wrote:
> 
>> Hi,
>> 
>> my Hadoop core-site.xml contains the following about tmp:
>> 
>> <property>
>>   <name>hadoop.tmp.dir</name>
>>   <value>/hadoop_data/hadoop_data/tmp</value>
>> </property>
>> 
>> 
>> 
>> my hive-default.xml contains the following about tmp:
>> 
>> <property>
>>   <name>hive.exec.scratchdir</name>
>>   <value>/tmp/hive-${user.name}</value>
>>   <description>Scratch space for Hive jobs</description>
>> </property>
>> 
>> <property>
>>   <name>hive.exec.local.scratchdir</name>
>>   <value>/tmp/${user.name}</value>
>>   <description>Local scratch space for Hive jobs</description>
>> </property>
>> 
>> 
>> 
>> Is this related to a configuration issue, or is it a bug?
>> 
>> Please help!
>> 
>> Regards
>> Arthur
>> 
>> 
>> On 6 Jan, 2015, at 3:45 am, Jason Dere <jd...@hortonworks.com> wrote:
>> 
>>> During query compilation Hive needs to instantiate the UDF class, so the JAR needs to be resolvable by the class loader; the JAR is therefore copied to a local temp location for use.
>>> During map/reduce jobs the local jar (like all jars added with the ADD JAR command) should then be added to the distributed cache. It looks like this is where the issue is occurring, but based on the path in the error message I suspect that either Hive or Hadoop is mistaking what should be a local path for an HDFS path.
>>> 
>>> On Jan 4, 2015, at 10:23 AM, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> A question: Why does it need to copy the jar file to the temp folder? Why couldn't it use the file defined in USING JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar' directly? 
>>>> 
>>>> Regards
>>>> Arthur
>>>> 
>>>> 
>>>> On 4 Jan, 2015, at 7:48 am, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> 
>>>>> A1: Are all of these commands (Step 1-5) from the same Hive CLI prompt?
>>>>> Yes
>>>>> 
>>>>> A2:  Would you be able to check if such a file exists with the same path, on the local file system?
>>>>> The file does not exist on the local file system.  
>>>>> 
>>>>> 
>>>>> Is there a way to set the another “tmp" folder for HIVE? or any suggestions to fix this issue?
>>>>> 
>>>>> Thanks !!
>>>>> 
>>>>> Arthur
>>>>>  
>>>>> 
>>>>> 
>>>>> On 3 Jan, 2015, at 4:12 am, Jason Dere <jd...@hortonworks.com> wrote:
>>>>> 
>>>>>> The point of USING JAR as part of the CREATE FUNCTION statement to try to avoid having to do ADD JAR/aux path stuff to get the UDF to work. 
>>>>>> 
>>>>>> Are all of these commands (Step 1-5) from the same Hive CLI prompt?
>>>>>> 
>>>>>>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>>>>>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>>>>>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>> OK
>>>>>> 
>>>>>> 
>>>>>> One note, /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar here should actually be on the local file system, not on HDFS where you were checking in Step 5. During CREATE FUNCTION/query compilation, Hive will make a copy of the source JAR (hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar), copied to a temp location on the local file system where it's used by that Hive session.
>>>>>> 
>>>>>> The location mentioned in the FileNotFoundException (hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar) has a different path than the local copy mentioned during CREATE FUNCTION (/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar). I'm not really sure why it is a HDFS path here either, but I'm not too familiar with what goes on during the job submission process. But the fact that this HDFS path has the same naming convention as the directory used for downloading resources locally (***_resources) looks a little fishy to me. Would you be able to check if such a file exists with the same path, on the local file system?
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Dec 31, 2014, at 5:22 AM, Nirmal Kumar <ni...@impetus.co.in> wrote:
>>>>>> 
>>>>>>>   Important: HiveQL's ADD JAR operation does not work with HiveServer2 and the Beeline client when Beeline runs on a different host. As an alterntive to ADD JAR, Hive auxiliary path functionality should be used as described below.
>>>>>>> 
>>>>>>> Refer:
>>>>>>> http://www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v4-8-0/Cloudera-Manager-Managing-Clusters/cmmc_hive_udf.html​
>>>>>>> 
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> -Nirmal
>>>>>>> 
>>>>>>> From: Arthur.hk.chan@gmail.com <ar...@gmail.com>
>>>>>>> Sent: Tuesday, December 30, 2014 9:54 PM
>>>>>>> To: vic0777
>>>>>>> Cc: Arthur.hk.chan@gmail.com; user@hive.apache.org
>>>>>>> Subject: Re: CREATE FUNCTION: How to automatically load extra jar file?
>>>>>>>  
>>>>>>> Thank you.
>>>>>>> 
>>>>>>> Will this work for hiveserver2 ?
>>>>>>> 
>>>>>>> 
>>>>>>> Arthur
>>>>>>> 
>>>>>>> On 30 Dec, 2014, at 2:24 pm, vic0777 <vi...@163.com> wrote:
>>>>>>> 
>>>>>>>> 
>>>>>>>> You can put it into $HOME/.hiverc like this: ADD JAR full_path_of_the_jar. Then, the file is automatically loaded when Hive is started.
>>>>>>>> 
>>>>>>>> Wantao
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> At 2014-12-30 11:01:06, "Arthur.hk.chan@gmail.com" <ar...@gmail.com> wrote:
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> I am using Hive 0.13.1 on Hadoop 2.4.1, I need to automatically load an extra JAR file to hive for UDF, below are my steps to create the UDF function. I have tried the following but still no luck to get thru.
>>>>>>>> 
>>>>>>>> Please help!!
>>>>>>>> 
>>>>>>>> Regards
>>>>>>>> Arthur
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Step 1:   (make sure the jar in in HDFS)
>>>>>>>> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>>>>>>> -rw-r--r--   3 hadoop hadoop      57388 2014-12-30 10:02hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>> 
>>>>>>>> Step 2: (drop if function exists) 
>>>>>>>> hive> drop function sysdate;                                                  
>>>>>>>> OK
>>>>>>>> Time taken: 0.013 seconds
>>>>>>>> 
>>>>>>>> Step 3: (create function using the jar in HDFS)
>>>>>>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>>>>>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>>>>>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>> OK
>>>>>>>> Time taken: 0.034 seconds
>>>>>>>> 
>>>>>>>> Step 4: (test)
>>>>>>>> hive> select sysdate();                                                                                                                                
>>>>>>>> Automatically selecting local only mode for query
>>>>>>>> Total jobs = 1
>>>>>>>> Launching Job 1 out of 1
>>>>>>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>>>>>>> SLF4J: Class path contains multiple SLF4J bindings.
>>>>>>>> SLF4J: Found binding in [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>>>> SLF4J: Found binding in [jar:file:/hadoop/hbase-0.98.5-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>>>>>>>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>>>>>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>>>>>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: yarn.nodemanager.loacl-dirs;  Ignoring.
>>>>>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
>>>>>>>> Execution log at: /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
>>>>>>>> java.io.FileNotFoundException: File does not exist:hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
>>>>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
>>>>>>>> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>>>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
>>>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
>>>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
>>>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
>>>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
>>>>>>>> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
>>>>>>>> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
>>>>>>>> at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
>>>>>>>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
>>>>>>>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
>>>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>>>>>>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
>>>>>>>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>>>>>>>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>>>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>>>>>>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>>>>>>>> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>>>>>>>> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
>>>>>>>> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
>>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>>>>>>>> Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist:hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar)'
>>>>>>>> Execution failed with exit status: 1
>>>>>>>> Obtaining error information
>>>>>>>> Task failed!
>>>>>>>> Task ID:
>>>>>>>>   Stage-1
>>>>>>>> Logs:
>>>>>>>> /tmp/hadoop/hive.log
>>>>>>>> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Step 5: (check the file)
>>>>>>>> hive> dfs -ls /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>>>>>>> ls: `/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar': No such file or directory
>>>>>>>> Command failed with exit code = 1
>>>>>>>> Query returned non-zero code: 1, cause: null
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
>>>>>> 
>>>>>> 
>>>>>> CONFIDENTIALITY NOTICE
>>>>>> NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
>>>>> 
>>>> 
>>> 
>>> 
>>> CONFIDENTIALITY NOTICE
>>> NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
>> 
> 
> 
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.


Re: CREATE FUNCTION: How to automatically load extra jar file?

Posted by Jason Dere <jd...@hortonworks.com>.
Does your hive.log contain any lines with "adding libjars:"?

You may also want to search for any lines containing "_resources"; I would like to see the results of both searches.

For example, mine is showing the following line:
2015-01-06 14:53:28,115 INFO  mr.ExecDriver (ExecDriver.java:execute(307)) - adding libjars: file:///tmp/d0ed1585-d9e6-4944-b985-225351574de0_resources/spatial-sdk-hive-1.0.3-SNAPSHOT.jar,file:///tmp/d0ed1585-d9e6-4944-b985-225351574de0_resources/esri-geometry-api.jar

I wonder if the libjars setting for your map/reduce job is somehow being sent without the "file:///" prefix, which might be causing Hadoop to interpret the path as an HDFS path rather than a local path.
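
If it helps, something like this from a shell should surface both (a rough sketch; I am assuming hive.log is at /tmp/hadoop/hive.log, as shown in the "Logs:" section of your earlier output, so adjust to wherever your hive.log actually lives):

# the libjars line is written at job submission time
grep "adding libjars:" /tmp/hadoop/hive.log
# plus any references to the per-session _resources download directory
grep "_resources" /tmp/hadoop/hive.log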


Re: CREATE FUNCTION: How to automatically load extra jar file?

Posted by "Arthur.hk.chan" <ar...@gmail.com>.
Hi,

my hadoop’s core-site.xml contains the following tmp-related property:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/hadoop_data/hadoop_data/tmp</value>
</property>



my hive-default.xml contains the following tmp-related properties:

<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/hive-${user.name}</value>
  <description>Scratch space for Hive jobs</description>
</property>

<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/tmp/${user.name}</value>
  <description>Local scratch space for Hive jobs</description>
</property>



Is this related to a configuration issue, or is it a bug?
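
For example, would overriding the local scratch directory for one session make any difference? Something like this (just a sketch; the directory is a made-up example, any writable local path should do):

# start the CLI with an alternative local scratch dir (hypothetical path)
hive --hiveconf hive.exec.local.scratchdir=/hadoop_data/hive_local_tmp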

Please help!

Regards
Arthur


On 6 Jan, 2015, at 3:45 am, Jason Dere <jd...@hortonworks.com> wrote:

> During query compilation Hive needs to instantiate the UDF class and so the JAR needs to be resolvable by the class loader, thus the JAR is copied locally to a temp location for use.
> During map/reduce jobs the local jar (like all jars added with the ADD JAR command) should then be added to the distributed cache. It looks like this is where the issue is occurring, but based on path in the error message I suspect that either Hive or Hadoop is mistaking what should be a local path with an HDFS path.
> 
> On Jan 4, 2015, at 10:23 AM, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:
> 
>> Hi,
>> 
>> A question: Why does it need to copy the jar file to the temp folder? Why couldn’t it use the file defined in using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar' directly? 
>> 
>> Regards
>> Arthur
>> 
>> 
>> On 4 Jan, 2015, at 7:48 am, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:
>> 
>>> Hi,
>>> 
>>> 
>>> A1: Are all of these commands (Step 1-5) from the same Hive CLI prompt?
>>> Yes
>>> 
>>> A2:  Would you be able to check if such a file exists with the same path, on the local file system?
>>> The file does not exist on the local file system.  
>>> 
>>> 
>>> Is there a way to set the another “tmp" folder for HIVE? or any suggestions to fix this issue?
>>> 
>>> Thanks !!
>>> 
>>> Arthur
>>>  
>>> 
>>> 
>>> On 3 Jan, 2015, at 4:12 am, Jason Dere <jd...@hortonworks.com> wrote:
>>> 
>>>> The point of USING JAR as part of the CREATE FUNCTION statement to try to avoid having to do ADD JAR/aux path stuff to get the UDF to work. 
>>>> 
>>>> Are all of these commands (Step 1-5) from the same Hive CLI prompt?
>>>> 
>>>>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>>>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>>>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>> OK
>>>> 
>>>> 
>>>> One note, /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar here should actually be on the local file system, not on HDFS where you were checking in Step 5. During CREATE FUNCTION/query compilation, Hive will make a copy of the source JAR (hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar), copied to a temp location on the local file system where it's used by that Hive session.
>>>> 
>>>> The location mentioned in the FileNotFoundException (hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar) has a different path than the local copy mentioned during CREATE FUNCTION (/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar). I'm not really sure why it is a HDFS path here either, but I'm not too familiar with what goes on during the job submission process. But the fact that this HDFS path has the same naming convention as the directory used for downloading resources locally (***_resources) looks a little fishy to me. Would you be able to check if such a file exists with the same path, on the local file system?
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Dec 31, 2014, at 5:22 AM, Nirmal Kumar <ni...@impetus.co.in> wrote:
>>>> 
>>>>>   Important: HiveQL's ADD JAR operation does not work with HiveServer2 and the Beeline client when Beeline runs on a different host. As an alterntive to ADD JAR, Hive auxiliary path functionality should be used as described below.
>>>>> 
>>>>> Refer:
>>>>> http://www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v4-8-0/Cloudera-Manager-Managing-Clusters/cmmc_hive_udf.html​
>>>>> 
>>>>> 
>>>>> Thanks,
>>>>> -Nirmal
>>>>> 
>>>>> From: Arthur.hk.chan@gmail.com <ar...@gmail.com>
>>>>> Sent: Tuesday, December 30, 2014 9:54 PM
>>>>> To: vic0777
>>>>> Cc: Arthur.hk.chan@gmail.com; user@hive.apache.org
>>>>> Subject: Re: CREATE FUNCTION: How to automatically load extra jar file?
>>>>>  
>>>>> Thank you.
>>>>> 
>>>>> Will this work for hiveserver2 ?
>>>>> 
>>>>> 
>>>>> Arthur
>>>>> 
>>>>> On 30 Dec, 2014, at 2:24 pm, vic0777 <vi...@163.com> wrote:
>>>>> 
>>>>>> 
>>>>>> You can put it into $HOME/.hiverc like this: ADD JAR full_path_of_the_jar. Then, the file is automatically loaded when Hive is started.
>>>>>> 
>>>>>> Wantao
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> At 2014-12-30 11:01:06, "Arthur.hk.chan@gmail.com" <ar...@gmail.com> wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> I am using Hive 0.13.1 on Hadoop 2.4.1, I need to automatically load an extra JAR file to hive for UDF, below are my steps to create the UDF function. I have tried the following but still no luck to get thru.
>>>>>> 
>>>>>> Please help!!
>>>>>> 
>>>>>> Regards
>>>>>> Arthur
>>>>>> 
>>>>>> 
>>>>>> Step 1:   (make sure the jar in in HDFS)
>>>>>> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>>>>> -rw-r--r--   3 hadoop hadoop      57388 2014-12-30 10:02hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>> 
>>>>>> Step 2: (drop if function exists) 
>>>>>> hive> drop function sysdate;                                                  
>>>>>> OK
>>>>>> Time taken: 0.013 seconds
>>>>>> 
>>>>>> Step 3: (create function using the jar in HDFS)
>>>>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>>>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>>>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>> OK
>>>>>> Time taken: 0.034 seconds
>>>>>> 
>>>>>> Step 4: (test)
>>>>>> hive> select sysdate();                                                                                                                               
>>>>>> Automatically selecting local only mode for query
>>>>>> Total jobs = 1
>>>>>> Launching Job 1 out of 1
>>>>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>>>>> SLF4J: Class path contains multiple SLF4J bindings.
>>>>>> SLF4J: Found binding in [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>> SLF4J: Found binding in [jar:file:/hadoop/hbase-0.98.5-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>>>>>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>>>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>>>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: yarn.nodemanager.loacl-dirs;  Ignoring.
>>>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
>>>>>> Execution log at: /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
>>>>>> java.io.FileNotFoundException: File does not exist:hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
>>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
>>>>>> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
>>>>>> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
>>>>>> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
>>>>>> at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
>>>>>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
>>>>>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>>>>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
>>>>>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>>>>>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>>>>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>>>>>> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>>>>>> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
>>>>>> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>>>>>> Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist:hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar)'
>>>>>> Execution failed with exit status: 1
>>>>>> Obtaining error information
>>>>>> Task failed!
>>>>>> Task ID:
>>>>>>   Stage-1
>>>>>> Logs:
>>>>>> /tmp/hadoop/hive.log
>>>>>> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>>>> 
>>>>>> 
>>>>>> Step 5: (check the file)
>>>>>> hive> dfs -ls /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>>>>> ls: `/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar': No such file or directory
>>>>>> Command failed with exit code = 1
>>>>>> Query returned non-zero code: 1, cause: null
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
>>>> 
>>>> 
>>>> CONFIDENTIALITY NOTICE
>>>> NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
>>> 
>> 
> 
> 
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.


Re: CREATE FUNCTION: How to automatically load extra jar file?

Posted by Jason Dere <jd...@hortonworks.com>.
During query compilation Hive needs to instantiate the UDF class, so the JAR must be resolvable by the class loader; the JAR is therefore copied to a local temp location for that session's use.
During map/reduce jobs the local jar (like all jars added with the ADD JAR command) should then be added to the distributed cache. It looks like this is where the issue is occurring, but based on the path in the error message I suspect that either Hive or Hadoop is mistaking what should be a local path for an HDFS path.
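
For instance, using the paths from your earlier output (the session hash will differ from session to session), the localized copy should be checked from an ordinary shell on the machine running the Hive CLI, not with "dfs -ls" from inside Hive:

# local file system check, from a shell rather than "hive> dfs -ls"
ls -l /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar

I believe the localized copy should also show up within the same Hive session via "list jars;".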


Re: CREATE FUNCTION: How to automatically load extra jar file?

Posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com>.
Hi,

A question: why does it need to copy the JAR file to the temp folder? Why couldn’t it use the file specified in USING JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar' directly?

Regards
Arthur


On 4 Jan, 2015, at 7:48 am, Arthur.hk.chan@gmail.com <ar...@gmail.com> wrote:

> Hi,
> 
> 
> A1: Are all of these commands (Step 1-5) from the same Hive CLI prompt?
> Yes
> 
> A2:  Would you be able to check if such a file exists with the same path, on the local file system?
> The file does not exist on the local file system.  
> 
> 
> Is there a way to set another "tmp" folder for Hive, or are there any suggestions to fix this issue?
> 
> Thanks !!
> 
> Arthur
>  
> 
> 
> On 3 Jan, 2015, at 4:12 am, Jason Dere <jd...@hortonworks.com> wrote:
> 
>> The point of USING JAR as part of the CREATE FUNCTION statement is to try to avoid having to do ADD JAR/aux path stuff to get the UDF to work.
>> 
>> Are all of these commands (Step 1-5) from the same Hive CLI prompt?
>> 
>>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>> OK
>> 
>> 
>> One note, /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar here should actually be on the local file system, not on HDFS where you were checking in Step 5. During CREATE FUNCTION/query compilation, Hive will make a copy of the source JAR (hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar), copied to a temp location on the local file system where it's used by that Hive session.
>> 
>> The location mentioned in the FileNotFoundException (hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar) has a different path than the local copy mentioned during CREATE FUNCTION (/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar). I'm not really sure why it is a HDFS path here either, but I'm not too familiar with what goes on during the job submission process. But the fact that this HDFS path has the same naming convention as the directory used for downloading resources locally (***_resources) looks a little fishy to me. Would you be able to check if such a file exists with the same path, on the local file system?
>> 
>> 
>> 
>> 
>> 
>> On Dec 31, 2014, at 5:22 AM, Nirmal Kumar <ni...@impetus.co.in> wrote:
>> 
>>>   Important: HiveQL's ADD JAR operation does not work with HiveServer2 and the Beeline client when Beeline runs on a different host. As an alterntive to ADD JAR, Hive auxiliary path functionality should be used as described below.
>>> 
>>> Refer:
>>> http://www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v4-8-0/Cloudera-Manager-Managing-Clusters/cmmc_hive_udf.html​
>>> 
>>> 
>>> Thanks,
>>> -Nirmal
>>> 
>>> From: Arthur.hk.chan@gmail.com <ar...@gmail.com>
>>> Sent: Tuesday, December 30, 2014 9:54 PM
>>> To: vic0777
>>> Cc: Arthur.hk.chan@gmail.com; user@hive.apache.org
>>> Subject: Re: CREATE FUNCTION: How to automatically load extra jar file?
>>>  
>>> Thank you.
>>> 
>>> Will this work for hiveserver2 ?
>>> 
>>> 
>>> Arthur
>>> 
>>> On 30 Dec, 2014, at 2:24 pm, vic0777 <vi...@163.com> wrote:
>>> 
>>>> 
>>>> You can put it into $HOME/.hiverc like this: ADD JAR full_path_of_the_jar. Then, the file is automatically loaded when Hive is started.
>>>> 
>>>> Wantao
>>>> 
>>>> 
>>>> 
>>>> 
>>>> At 2014-12-30 11:01:06, "Arthur.hk.chan@gmail.com" <ar...@gmail.com> wrote:
>>>> Hi,
>>>> 
>>>> I am using Hive 0.13.1 on Hadoop 2.4.1, I need to automatically load an extra JAR file to hive for UDF, below are my steps to create the UDF function. I have tried the following but still no luck to get thru.
>>>> 
>>>> Please help!!
>>>> 
>>>> Regards
>>>> Arthur
>>>> 
>>>> 
>>>> Step 1:   (make sure the jar in in HDFS)
>>>> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>>> -rw-r--r--   3 hadoop hadoop      57388 2014-12-30 10:02hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>> 
>>>> Step 2: (drop if function exists) 
>>>> hive> drop function sysdate;                                                  
>>>> OK
>>>> Time taken: 0.013 seconds
>>>> 
>>>> Step 3: (create function using the jar in HDFS)
>>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>> OK
>>>> Time taken: 0.034 seconds
>>>> 
>>>> Step 4: (test)
>>>> hive> select sysdate();                                                                                                                                
>>>> Automatically selecting local only mode for query
>>>> Total jobs = 1
>>>> Launching Job 1 out of 1
>>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>>> SLF4J: Class path contains multiple SLF4J bindings.
>>>> SLF4J: Found binding in [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>> SLF4J: Found binding in [jar:file:/hadoop/hbase-0.98.5-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>>>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: yarn.nodemanager.loacl-dirs;  Ignoring.
>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
>>>> Execution log at: /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
>>>> java.io.FileNotFoundException: File does not exist:hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
>>>> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
>>>> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>>> at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
>>>> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
>>>> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
>>>> at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
>>>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
>>>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
>>>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>>>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>>>> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>>>> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
>>>> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>>>> Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist:hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar)'
>>>> Execution failed with exit status: 1
>>>> Obtaining error information
>>>> Task failed!
>>>> Task ID:
>>>>   Stage-1
>>>> Logs:
>>>> /tmp/hadoop/hive.log
>>>> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>> 
>>>> 
>>>> Step 5: (check the file)
>>>> hive> dfs -ls /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>>> ls: `/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar': No such file or directory
>>>> Command failed with exit code = 1
>>>> Query returned non-zero code: 1, cause: null
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
>> 
>> 
>> CONFIDENTIALITY NOTICE
>> NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
> 


Re: CREATE FUNCTION: How to automatically load extra jar file?

Posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com>.
Hi,


A1: Are all of these commands (Step 1-5) from the same Hive CLI prompt?
Yes

A2:  Would you be able to check if such a file exists with the same path, on the local file system?
The file does not exist on the local file system.  


Is there a way to set a different "tmp" folder for Hive, or any suggestions to fix this issue?
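
For example, would overriding the session resources directory at CLI startup help? This is only a sketch: I am assuming the hive.downloaded.resources.dir property controls where these *_resources copies go, and /data/hive_tmp is a made-up path.

# made-up base directory; mirrors the default /tmp/${hive.session.id}_resources layout
hive --hiveconf 'hive.downloaded.resources.dir=/data/hive_tmp/${hive.session.id}_resources'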

Thanks !!

Arthur
 




Re: CREATE FUNCTION: How to automatically load extra jar file?

Posted by Jason Dere <jd...@hortonworks.com>.
The point of USING JAR as part of the CREATE FUNCTION statement is to avoid having to do ADD JAR/aux path stuff to get the UDF to work. 

Are all of these commands (Step 1-5) from the same Hive CLI prompt?

>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>> OK


One note: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar here should actually be on the local file system, not on HDFS where you were checking in Step 5. During CREATE FUNCTION/query compilation, Hive makes a copy of the source JAR (hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar) in a temp location on the local file system, where it is used by that Hive session.

The location mentioned in the FileNotFoundException (hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar) has a different path than the local copy mentioned during CREATE FUNCTION (/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar). I'm not really sure why it is an HDFS path here either, as I'm not too familiar with what goes on during the job submission process. But the fact that this HDFS path has the same naming convention as the directory used for downloading resources locally (***_resources) looks a little fishy to me. Would you be able to check if such a file exists with the same path, on the local file system?
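
For instance, from a shell on the machine where the Hive CLI runs (a sketch; the paths are the ones from your logs):

# local file system on the Hive CLI host, not HDFS
ls -l /tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar

# for comparison, what actually sits in HDFS under /tmp
hdfs dfs -ls /tmp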





On Dec 31, 2014, at 5:22 AM, Nirmal Kumar <ni...@impetus.co.in> wrote:

>   Important: HiveQL's ADD JAR operation does not work with HiveServer2 and the Beeline client when Beeline runs on a different host. As an alternative to ADD JAR, Hive auxiliary path functionality should be used as described below.
> 
> Refer:
> http://www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v4-8-0/Cloudera-Manager-Managing-Clusters/cmmc_hive_udf.html
> 
> 
> Thanks,
> -Nirmal

RE: CREATE FUNCTION: How to automatically load extra jar file?

Posted by Nirmal Kumar <ni...@impetus.co.in>.
Important: HiveQL's ADD JAR operation does not work with HiveServer2 and the Beeline client when Beeline runs on a different host. As an alternative to ADD JAR, Hive auxiliary path functionality should be used as described below.


Refer:

http://www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v4-8-0/Cloudera-Manager-Managing-Clusters/cmmc_hive_udf.html
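
For example, the auxiliary path can be set in hive-env.sh on the HiveServer2 host. This is only a sketch: the local jar location below is an assumption, and HiveServer2 must be restarted to pick it up.

# hive-env.sh -- assumed local path for the UDF jar
export HIVE_AUX_JARS_PATH=/usr/lib/hive/auxlib/nexr-hive-udf-0.2-SNAPSHOT.jar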



Thanks,
-Nirmal

________________________________
From: Arthur.hk.chan@gmail.com <ar...@gmail.com>
Sent: Tuesday, December 30, 2014 9:54 PM
To: vic0777
Cc: Arthur.hk.chan@gmail.com; user@hive.apache.org
Subject: Re: CREATE FUNCTION: How to automatically load extra jar file?

Thank you.

Will this work for HiveServer2?


Arthur


Re: CREATE FUNCTION: How to automatically load extra jar file?

Posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com>.
Thank you.

Will this work for HiveServer2?


Arthur

On 30 Dec, 2014, at 2:24 pm, vic0777 <vi...@163.com> wrote:

> 
> You can put it into $HOME/.hiverc like this: ADD JAR full_path_of_the_jar. Then, the file is automatically loaded when Hive is started.
> 
> Wantao
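> 
> For example, a minimal $HOME/.hiverc might look like this (just a sketch; the local jar path is an assumption):
> 
> -- $HOME/.hiverc, read automatically by the Hive CLI at startup
> ADD JAR /usr/local/hive/lib/nexr-hive-udf-0.2-SNAPSHOT.jar;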