Posted to user@hive.apache.org by mp...@gmail.com on 2013/12/02 22:13:50 UTC

How to specify Hive auxiliary jar in HDFS, not local file system

Is it possible to specify a Hive auxiliary jar (like a SerDe) that is in
HDFS rather than the local filesystem?

I am using a CsvSerDe I wrote, and when I specify it through Hive's
hive.aux.jars.path with a local file system path, it works:

hive -hiveconf hive.aux.jars.path=file:///path/to/truven-hive-serdes-1.0.jar \
  -hiveconf hive.auto.convert.join.noconditionaltask.size=25000000 \
  -f hivefiscalyearqueries.sql

But when I put that jar in HDFS and point to it there, it fails:

hive -hiveconf hive.aux.jars.path=hdfs:///hdfspath/to/truven-hive-serdes-1.0.jar \
  -hiveconf hive.auto.convert.join.noconditionaltask.size=25000000 \
  -f hivefiscalyearqueries.sql

with the error message:

java.lang.ClassNotFoundException: com.truven.hiveserde.csv.CsvSerDe
Continuing ...
2013-12-02 03:48:25     Starting to launch local task to process map join;      maximum memory = 1065484288
org.apache.hadoop.hive.ql.metadata.HiveException: Failed with exception nulljava.lang.NullPointerException
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromTable(FetchOperator.java:230)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:595)
        at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:406)
        at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:290)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:682)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:160)

        at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631)
        at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:406)
        at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:290)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:682)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Execution failed with exit status: 2
Obtaining error information

Task failed!



I will need to run this from Oozie eventually, so I'd like to know how to
get Hive to use a jar in HDFS, rather than having to distribute the file to
the local file system of all datanodes.

Thank you,
Michael

Re: How to specify Hive auxiliary jar in HDFS, not local file system

Posted by mp...@gmail.com.

Thanks.  I just got an Oozie Hive action set up to test on a single-node
cluster, and putting "ADD JAR /path/to/hdfs/location" in the Hive script
worked. Hopefully I won't hit any issues when I try it on a multi-node
cluster.
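
For anyone finding this later, here is a minimal sketch of that approach:
the Hive script gets an ADD JAR line ahead of the queries. The HDFS path is
the placeholder from my original question, the output file name is made up
for illustration, and whether a bare path or a full hdfs:// URI is needed
may depend on the Hive version and on how Oozie ships the jar, so treat it
as a starting point rather than a verified recipe.

```shell
#!/bin/sh
# Sketch only: prepend an ADD JAR statement to an existing Hive script.
# JAR_URI reuses the placeholder path from the question; adjust for your cluster.
JAR_URI='hdfs:///hdfspath/to/truven-hive-serdes-1.0.jar'
SRC=hivefiscalyearqueries.sql
OUT=hivefiscalyearqueries-with-jar.sql   # hypothetical output name

# ADD JAR has to come before any statement that reads a table using the SerDe.
printf 'ADD JAR %s;\n' "$JAR_URI" > "$OUT"

# Append the original queries if the file is present.
if [ -f "$SRC" ]; then
  cat "$SRC" >> "$OUT"
fi

# Run only where the hive CLI is installed.
if command -v hive >/dev/null 2>&1; then
  hive -hiveconf hive.auto.convert.join.noconditionaltask.size=25000000 -f "$OUT"
fi
```

With the jar registered inside the script, hive.aux.jars.path no longer
has to be passed on the command line.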


On Mon, Dec 2, 2013 at 5:37 PM, Adam Kawa <ka...@gmail.com> wrote:

> You can use the ADD JAR command inside a Hive script and a parameter in the
> Oozie workflow definition. An example is here:
> http://blog.cloudera.com/blog/2013/01/how-to-schedule-recurring-hadoop-jobs-with-apache-oozie/

Re: How to specify Hive auxiliary jar in HDFS, not local file system

Posted by Adam Kawa <ka...@gmail.com>.

You can use the ADD JAR command inside a Hive script and a parameter in the
Oozie workflow definition. An example is here:
http://blog.cloudera.com/blog/2013/01/how-to-schedule-recurring-hadoop-jobs-with-apache-oozie/
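
As a rough sketch of the parameter idea: the script refers to the jar
through a variable, and the value is supplied from outside. The variable
name SERDE_JAR and the file name below are made up for illustration, and the
jar path is the placeholder from the question. From the command line the
value can be passed with --hivevar; in an Oozie Hive action it would
normally come from a <param> element in the workflow, which I have not
shown here.

```shell
#!/bin/sh
# Sketch only: keep the jar location out of the script and pass it in as a variable.
SERDE_JAR='hdfs:///hdfspath/to/truven-hive-serdes-1.0.jar'  # placeholder path from the question
SCRIPT=add-serde-jar.sql                                    # hypothetical file name

# Single quotes keep ${SERDE_JAR} literal so that Hive, not the shell, substitutes it.
printf 'ADD JAR ${SERDE_JAR};\n' > "$SCRIPT"

# Supply the value at launch time; run only where the hive CLI is installed.
if command -v hive >/dev/null 2>&1; then
  hive --hivevar SERDE_JAR="$SERDE_JAR" -f "$SCRIPT"
fi
```

The same script can then be reused across environments by changing only
the value that is passed in.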

