Posted to user@hive.apache.org by Ping Zhu <pi...@sharethis.com> on 2010/07/22 00:54:07 UTC
deploy simple UDF function
Hi,
I have a problem calling a simple UDF in a Hive query. I compiled the
function and created a jar file on my local PC. The jar file was then sent
to a remote Hive cluster and deployed. When the UDF is called in a Hive
query, the error "FAILED: Unknown exception: null" is returned. I checked
the Hive log file; the detailed error message is:
2010-07-21 15:45:33,590 ERROR ql.Driver (SessionState.java:printError(248)) - FAILED: Unknown exception: null
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFUtils$ConversionHelper.<init>(GenericUDFUtils.java:214)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.initialize(GenericUDFBridge.java:140)
    at org.apache.hadoop.hive.ql.plan.exprNodeGenericFuncDesc.newInstance(exprNodeGenericFuncDesc.java:141)
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getFuncExprNodeDesc(TypeCheckProcFactory.java:444)
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:541)
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:634)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:80)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:83)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:117)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:95)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:5283)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:1005)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:991)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:4234)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:4714)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5203)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:105)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:275)
    at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:320)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:312)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:123)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
The versions of Hive installed on my local PC and on the remote Hive cluster
are 0.6 and 0.5 respectively. I copied the jar files needed to compile the
UDF from the remote Hive cluster, but it still does not work.
Any suggestions or comments would be highly appreciated.
Thanks and best regards,
Ping
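Since the jar was built against a different Hive version than the one on the cluster, one thing that can be checked locally is which evaluate signatures the compiled class actually exposes. The following is only a sketch using plain Java reflection: SampleUdf and describeEvaluateMethods are hypothetical names, String stands in for org.apache.hadoop.io.Text so the example runs without Hive on the classpath, and this is not the logic Hive itself uses to bind a UDF.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class EvaluateSignatures {

    // Hypothetical stand-in for a compiled UDF class; a real check would
    // load the class from the deployed jar instead.
    static final class SampleUdf {
        public boolean evaluate(final String s) {
            return s != null;
        }
    }

    // Collects "returnType evaluate(paramTypes)" for each public method
    // named evaluate on the given class.
    static List<String> describeEvaluateMethods(Class<?> udfClass) {
        List<String> out = new ArrayList<>();
        for (Method m : udfClass.getMethods()) {
            if (!m.getName().equals("evaluate")) {
                continue;
            }
            StringBuilder sb = new StringBuilder();
            sb.append(m.getReturnType().getSimpleName()).append(" evaluate(");
            Class<?>[] params = m.getParameterTypes();
            for (int i = 0; i < params.length; i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(params[i].getSimpleName());
            }
            out.add(sb.append(')').toString());
        }
        return out;
    }

    public static void main(String[] args) {
        for (String sig : describeEvaluateMethods(SampleUdf.class)) {
            System.out.println(sig); // prints "boolean evaluate(String)"
        }
    }
}
```

Running the same helper against the class loaded from the deployed jar would show whether the return and parameter types are the ones expected.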
Re: deploy simple UDF function
Posted by Ping Zhu <pi...@sharethis.com>.
There was a typo in my previous email. The where clause in the Hive query
that ran into the exception was "where f(col) = true".
On Wed, Jul 21, 2010 at 4:57 PM, Ping Zhu <pi...@sharethis.com> wrote:
> I figured out the source of the error: the UDF (say, f) returns a boolean
> value. The where clause in the Hive query was "where f(col) is true". I
> changed the where clause to "where f(col)", and then it works.
>
> I did another contrived test, changing the return type of the UDF to int
> and the where clause to "where f(col)=1". It also works.
>
> Is this a bug in the Hive compiler?
>
> Ping
>
>
> On Wed, Jul 21, 2010 at 3:55 PM, Ping Zhu <pi...@sharethis.com> wrote:
>
>> I have tested this simple UDF function locally. The function itself is
>> properly implemented.
Re: deploy simple UDF function
Posted by Ping Zhu <pi...@sharethis.com>.
The version of Hive I am using is 0.50
On Thu, Jul 22, 2010 at 11:53 AM, Paul Yang <py...@facebook.com> wrote:
> Hey Ping,
>
>
>
> I just tried the same UDF/query but I am unable to reproduce that NPE.
> Which version of hive are you using?
>
>
>
> Cheers,
>
> Paul
RE: deploy simple UDF function
Posted by Paul Yang <py...@facebook.com>.
Hey Ping,
I just tried the same UDF/query but I am unable to reproduce that NPE. Which version of hive are you using?
Cheers,
Paul
From: Ping Zhu [mailto:ping@sharethis.com]
Sent: Wednesday, July 21, 2010 5:48 PM
To: hive-user@hadoop.apache.org
Subject: Re: deploy simple UDF function
This problem still exists. My small test case is:
I created a table string_table with one column of string type and inserted one record into it. I then created another UDF, "udftest", which takes a Text argument and returns a boolean value. The query is "select * from string_table where udftest(col) = true;". It returns the error "FAILED: Unknown exception: null".
UDF source code:
package com.example;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public final class UDFTest extends UDF {
    // Returns false for a null input and true otherwise.
    public boolean evaluate(final Text s) {
        if (s == null) {
            return false;
        }
        return true;
    }
}
Re: deploy simple UDF function
Posted by Ping Zhu <pi...@sharethis.com>.
This problem still exists. My small test case is:
I created a table string_table with one column of string type and inserted
one record into it. I then created another UDF, "udftest", which takes a
Text argument and returns a boolean value. The query is "select * from
string_table where udftest(col) = true;". It returns the error "FAILED:
Unknown exception: null".
UDF source code:
package com.example;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public final class UDFTest extends UDF {
    // Returns false for a null input and true otherwise.
    public boolean evaluate(final Text s) {
        if (s == null) {
            return false;
        }
        return true;
    }
}
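The null check in evaluate can be exercised without a cluster. A minimal sketch, assuming CharSequence as a stand-in for org.apache.hadoop.io.Text so that it runs without Hive or Hadoop on the classpath; the class name UdfTestLocalCheck is hypothetical. It only confirms the function logic and says nothing about how Hive binds the UDF inside a where clause.

```java
// Stand-in for the UDF's evaluate logic, with no Hive or Hadoop dependency.
// CharSequence is a hypothetical substitute for org.apache.hadoop.io.Text.
public class UdfTestLocalCheck {

    // Same null check as UDFTest.evaluate: false for null, true otherwise.
    static boolean evaluate(final CharSequence s) {
        return s != null;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("some value")); // prints "true"
        System.out.println(evaluate(null));         // prints "false"
    }
}
```

Both cases behaving as expected locally is consistent with the exception coming from query compilation rather than from the function body, which matches the stack trace ending in Driver.compile.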
On Wed, Jul 21, 2010 at 5:20 PM, Paul Yang <py...@facebook.com> wrote:
> I did notice that if the where clause is not a Boolean expression, an
> exception is thrown, e.g. SELECT key FROM src WHERE 1; I filed a JIRA for
> this issue at:
>
>
>
> https://issues.apache.org/jira/browse/HIVE-1478
>
>
>
> Glad that your query works now, but “where f(col) = true” should not cause
> an error, as the = operator returns a boolean value. So it’s strange that
> you got an error… if you see this problem again, could you post a test case?
RE: deploy simple UDF function
Posted by Paul Yang <py...@facebook.com>.
I did notice that if the where clause is not a Boolean expression, an exception is thrown, e.g. SELECT key FROM src WHERE 1; I filed a JIRA for this issue at:
https://issues.apache.org/jira/browse/HIVE-1478
Glad that your query works now, but "where f(col) = true" should not cause an error, as the = operator returns a boolean value. So it's strange that you got an error... if you see this problem again, could you post a test case?
From: Ping Zhu [mailto:ping@sharethis.com]
Sent: Wednesday, July 21, 2010 5:10 PM
To: hive-user@hadoop.apache.org
Subject: Re: deploy simple UDF function
It was "where f(col) = true". Sorry for the typo.
Ping
Re: deploy simple UDF function
Posted by Ping Zhu <pi...@sharethis.com>.
It was "where f(col) = true". Sorry for the typo.
Ping
On Wed, Jul 21, 2010 at 5:03 PM, Paul Yang <py...@facebook.com> wrote:
> Hold on, how did ‘where f(col) is true’ compile? I don’t think “is true”
> is valid HQL. Can you post the full query?
> On Wed, Jul 21, 2010 at 3:54 PM, Ping Zhu <pi...@sharethis.com> wrote:
>
> Hi,
>
>
>
> I have a problem with calling a simple UDF function in Hive query. I
> compiled the function and created a jar file on my local pc. Then the jar
> file is sent to a remote Hive cluster and deployed. When this UDF function
> is called in a Hive query, an error "FAILED: Unknown exception: null"
> returns. I checked Hive log file, the detailed error message is:
>
>
>
> 2010-07-21 15:45:33,590 ERROR ql.Driver (SessionState.java:printError(248))
> - FAILED: Unknown exception: null
>
> java.lang.NullPointerException
>
> at
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFUtils$ConversionHelper.<init>(GenericUDFUtils.java:214)
>
> at
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.initialize(GenericUDFBridge.java:140)
>
> at
> org.apache.hadoop.hive.ql.plan.exprNodeGenericFuncDesc.newInstance(exprNodeGenericFuncDesc.java:141)
>
> at
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getFuncExprNodeDesc(TypeCheckProcFactory.java:444)
>
> at
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:541)
>
> at
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:634)
>
> at
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:80)
>
> at
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:83)
>
> at
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:117)
>
> at
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:95)
>
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:5283)
>
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:1005)
>
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:991)
>
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:4234)
>
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:4714)
>
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5203)
>
> at
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:105)
>
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:275)
>
> at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:320)
>
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:312)
>
> at
> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:123)
>
> at
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181)
>
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287)
>
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>
> at java.lang.reflect.Method.invoke(Method.java:597)
>
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
>
>
>
>
> The versions of Hive installed on my local pc and remote Hive cluster are
> 0.6 and 0.5 respectively. I copied corresponding jar files which are needed
> to compile the UDF function from remote Hive cluster, but it still does not
> work.
>
>
>
> Any suggestions/comments will be highly appreciated.
>
>
>
> Thanks and best regards,
>
>
>
> Ping
>
>
>
>
>
RE: deploy simple UDF function
Posted by Paul Yang <py...@facebook.com>.
Hold on, how did 'where f(col) is true' compile? I don't think "is true" is valid HQL. Can you post the full query?
From: Ping Zhu [mailto:ping@sharethis.com]
Sent: Wednesday, July 21, 2010 4:58 PM
To: hive-user@hadoop.apache.org
Subject: Re: deploy simple UDF function
I figured out the source of the error. The UDF (say, f) returns a boolean value, and the where clause in the Hive query was "where f(col) is true". I changed the where clause to "where f(col)", and then it works.
As another contrived test, I changed the return type of the UDF to int and the where clause to "where f(col)=1". That also works.
Is this a bug in the Hive compiler?
Ping
Re: deploy simple UDF function
Posted by Ping Zhu <pi...@sharethis.com>.
I figured out the source of the error. The UDF (say, f) returns a boolean
value, and the where clause in the Hive query was "where f(col) is true".
I changed the where clause to "where f(col)", and then it works.
As another contrived test, I changed the return type of the UDF to int and
the where clause to "where f(col)=1". That also works.
Is this a bug in the Hive compiler?
Ping
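The trace in this thread fails inside Driver.compile, at GenericUDFBridge.initialize calling the ConversionHelper constructor, so the error is raised while the query is being analyzed and not while rows are processed. The sketch below is not Hive code. It is a self-contained illustration of one possible mechanism, under the assumption (not confirmed anywhere in this thread) that no matching "evaluate" overload was found for a comparison between two boolean operands: a reflection lookup that returns null, dereferenced without a check, surfaces as a NullPointerException with no useful message. EvalLookupSketch, EqualsFn, findEvaluate, arityUnguarded and arityGuarded are hypothetical names.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

public class EvalLookupSketch {

    // Hypothetical comparison function with Integer and String overloads
    // only; it deliberately has no (Boolean, Boolean) overload.
    public static class EqualsFn {
        public Boolean evaluate(Integer a, Integer b) { return a.equals(b); }
        public Boolean evaluate(String a, String b) { return a.equals(b); }
    }

    // Returns the public "evaluate" overload whose parameter types match
    // argTypes exactly, or null when there is no such overload.
    public static Method findEvaluate(Class<?> fn, Class<?>... argTypes) {
        for (Method m : fn.getMethods()) {
            if (m.getName().equals("evaluate")
                    && Arrays.equals(m.getParameterTypes(), argTypes)) {
                return m;
            }
        }
        return null;
    }

    // Unguarded: uses the lookup result directly, so a failed lookup
    // turns into a NullPointerException far from the real cause.
    public static int arityUnguarded(Class<?> fn, Class<?>... argTypes) {
        return findEvaluate(fn, argTypes).getParameterTypes().length;
    }

    // Guarded: reports which argument types could not be matched.
    public static int arityGuarded(Class<?> fn, Class<?>... argTypes) {
        Method m = findEvaluate(fn, argTypes);
        if (m == null) {
            throw new IllegalArgumentException("No evaluate overload of "
                    + fn.getSimpleName() + " accepts " + Arrays.toString(argTypes));
        }
        return m.getParameterTypes().length;
    }

    public static void main(String[] args) {
        // An (Integer, Integer) comparison resolves normally.
        System.out.println(arityGuarded(EqualsFn.class, Integer.class, Integer.class));

        // A (Boolean, Boolean) comparison has no overload here.
        try {
            arityUnguarded(EqualsFn.class, Boolean.class, Boolean.class);
        } catch (NullPointerException e) {
            System.out.println("unguarded lookup failed with NullPointerException");
        }
        try {
            arityGuarded(EqualsFn.class, Boolean.class, Boolean.class);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this assumption holds, it would be consistent with the observations above: "where f(col)" involves no comparison at all, and "where f(col)=1" compares two integers. Whether Hive 0.5 actually behaves this way would need to be checked against its source.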
On Wed, Jul 21, 2010 at 3:55 PM, Ping Zhu <pi...@sharethis.com> wrote:
> I have tested this simple UDF function locally. The function itself is
> properly implemented.
Re: deploy simple UDF function
Posted by Ping Zhu <pi...@sharethis.com>.
I have tested this simple UDF function locally. The function itself is
properly implemented.