Posted to user@hive.apache.org by Ping Zhu <pi...@sharethis.com> on 2010/07/22 00:54:07 UTC

deploy simple UDF function

Hi,

  I have a problem calling a simple UDF in a Hive query. I compiled the
function and created a jar file on my local PC. The jar file was then sent
to a remote Hive cluster and deployed. When the UDF is called in a Hive
query, the error "FAILED: Unknown exception: null" is returned. I checked
the Hive log file; the detailed error message is:

2010-07-21 15:45:33,590 ERROR ql.Driver (SessionState.java:printError(248)) - FAILED: Unknown exception: null
java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.udf.generic.GenericUDFUtils$ConversionHelper.<init>(GenericUDFUtils.java:214)
        at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.initialize(GenericUDFBridge.java:140)
        at org.apache.hadoop.hive.ql.plan.exprNodeGenericFuncDesc.newInstance(exprNodeGenericFuncDesc.java:141)
        at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getFuncExprNodeDesc(TypeCheckProcFactory.java:444)
        at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:541)
        at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:634)
        at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:80)
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:83)
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:117)
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:95)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:5283)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:1005)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:991)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:4234)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:4714)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5203)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:105)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:275)
        at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:320)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:312)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:123)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)


  The versions of Hive installed on my local PC and on the remote Hive
cluster are 0.6 and 0.5 respectively. I copied the corresponding jar files
needed to compile the UDF from the remote Hive cluster, but it still does
not work.
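
  When two Hive versions are in play, one thing worth ruling out is which
jar a class is really loaded from at run time. The sketch below is only an
illustration using standard JDK reflection; the JarOrigin class and its
locate method are hypothetical names, not part of Hive, and the thread does
not establish that a version mismatch caused the error.

```java
import java.security.CodeSource;

// Hypothetical helper, not part of Hive: reports where a class was loaded from.
final class JarOrigin {
    private JarOrigin() {}

    /** Returns the jar or directory a class was loaded from, or a placeholder if unknown. */
    static String locate(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        // Classes that ship with the JDK itself usually have no code source.
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name; defaults to this class when none is given.
        String name = args.length > 0 ? args[0] : JarOrigin.class.getName();
        System.out.println(name + " -> " + locate(Class.forName(name)));
    }
}
```

  Run with the cluster's classpath and a class name such as
org.apache.hadoop.hive.ql.exec.UDF as the argument, this would print the
jar that class comes from, which can then be compared with the jars used
for compilation.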

  Any suggestions/comments will be highly appreciated.

  Thanks and best regards,

  Ping

Re: deploy simple UDF function

Posted by Ping Zhu <pi...@sharethis.com>.
There was a typo in my previous email. The where clause in the Hive query that
ran into the exception was "where f(col) = true".

On Wed, Jul 21, 2010 at 4:57 PM, Ping Zhu <pi...@sharethis.com> wrote:

> I figured out the source of the error: the UDF (say, f) returns a boolean
> value. The where clause in the Hive query was "where f(col) is true". I changed
> the where clause to "where f(col)". Then it works.
>
> I ran another contrived test, changing the return type of the UDF to int. The
> where clause in the Hive query was changed to "where f(col)=1". It also works.
>
> Is this an issue/bug in the Hive compiler?
>
> Ping
>
>
> On Wed, Jul 21, 2010 at 3:55 PM, Ping Zhu <pi...@sharethis.com> wrote:
>
>> I have tested this simple UDF function locally. The function itself is
>> properly implemented.
>

Re: deploy simple UDF function

Posted by Ping Zhu <pi...@sharethis.com>.
The version of Hive I am using is 0.5.0.

On Thu, Jul 22, 2010 at 11:53 AM, Paul Yang <py...@facebook.com> wrote:

>  Hey Ping,
>
>
>
> I just tried the same UDF/query but I am unable to reproduce that NPE.
> Which version of hive are you using?
>
>
>
> Cheers,
>
> Paul

RE: deploy simple UDF function

Posted by Paul Yang <py...@facebook.com>.
Hey Ping,

I just tried the same UDF/query but I am unable to reproduce that NPE. Which version of Hive are you using?

Cheers,
Paul

Re: deploy simple UDF function

Posted by Ping Zhu <pi...@sharethis.com>.
This problem still exists. My small test case is:

I created a table string_table with one column of string type and inserted one
record into it. I created another UDF, "udftest", which takes a Text argument
and returns a boolean value. The query is "select * from string_table where
udftest(col) = true;". The error "FAILED: Unknown exception: null" is
returned.

UDF function source code:

package com.example;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public final class UDFTest extends UDF {
    public boolean evaluate(final Text s) {
        if (s == null) {
            return false;
        }
        return true;
    }
}
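
The stack trace passes through GenericUDFBridge, the adapter Hive uses for
classes that extend UDF, which locates an evaluate method reflectively. As a
rough illustration of that kind of lookup, and of a way to exercise the
evaluate logic without a cluster, here is a minimal sketch. It makes several
assumptions: UdfTestStandIn replaces Text with String so that it compiles
without Hive or Hadoop jars, and EvaluateLookup, findEvaluate and call are
hypothetical names. It does not reproduce Hive's actual method resolution or
the NPE reported in this thread.

```java
import java.lang.reflect.Method;

// Stand-in for the UDF in this thread: String replaces org.apache.hadoop.io.Text
// so the sketch compiles without Hive or Hadoop jars (an assumption for illustration).
final class UdfTestStandIn {
    public boolean evaluate(final String s) {
        return s != null;
    }
}

// Hypothetical helper, not Hive code: finds and invokes evaluate by reflection.
final class EvaluateLookup {
    private EvaluateLookup() {}

    /** Returns the public one-argument evaluate method taking argType, or null if none exists. */
    static Method findEvaluate(Class<?> udfClass, Class<?> argType) {
        for (Method m : udfClass.getMethods()) {
            Class<?>[] params = m.getParameterTypes();
            if (m.getName().equals("evaluate") && params.length == 1 && params[0] == argType) {
                return m;
            }
        }
        return null; // no matching overload
    }

    /** Invokes the method on target with a single argument, wrapping reflection failures. */
    static Object call(Method m, Object target, Object arg) {
        try {
            return m.invoke(target, arg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not invoke " + m, e);
        }
    }

    public static void main(String[] args) {
        Method m = findEvaluate(UdfTestStandIn.class, String.class);
        System.out.println(call(m, new UdfTestStandIn(), "abc")); // prints true
        System.out.println(call(m, new UdfTestStandIn(), null));  // prints false
        // No evaluate(Integer) overload exists, so the lookup yields null.
        System.out.println(findEvaluate(UdfTestStandIn.class, Integer.class)); // prints null
    }
}
```

A caller that uses the result of a failed lookup without checking it for
null would hit a NullPointerException; whether that is what happens inside
Hive 0.5 here is not confirmed by anything in this thread.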

On Wed, Jul 21, 2010 at 5:20 PM, Paul Yang <py...@facebook.com> wrote:

>  I did notice that if the where clause is not a Boolean expression, an
> exception is thrown – e.g. SELECT key FROM src WHERE 1; I filed a JIRA for
> this issue at:
>
>
>
> https://issues.apache.org/jira/browse/HIVE-1478
>
>
>
> Glad that your query works now, but “where f(col) = true” should not cause
> an error, as the = operator returns a boolean value. So it’s strange that
> you got an error… if you see this problem again, could you post a test case?

RE: deploy simple UDF function

Posted by Paul Yang <py...@facebook.com>.
I did notice that if the where clause is not a Boolean expression, an exception is thrown - e.g. SELECT key FROM src WHERE 1; I filed a JIRA for this issue at:

https://issues.apache.org/jira/browse/HIVE-1478

Glad that your query works now, but "where f(col) = true" should not cause an error, as the = operator returns a boolean value. So it's strange that you got an error... if you see this problem again, could you post a test case?
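
To make the typing argument concrete, here is a toy model of the check being described: a WHERE predicate is accepted only if its expression type is boolean, and the = operator yields a boolean whatever its operands are. Every name in it (FilterTypeCheck, Expr, value, call, equalTo, checkWhere) is invented for illustration. It is not Hive's planner code and does not claim to reflect how HIVE-1478 was or will be resolved.

```java
// Toy model for illustration only; none of these names come from Hive.
final class FilterTypeCheck {
    private FilterTypeCheck() {}

    enum Type { BOOLEAN, INT, STRING }

    /** An expression that knows the type of the value it produces. */
    interface Expr {
        Type type();
    }

    /** A literal or column reference of the given type, e.g. 1 or true. */
    static Expr value(final Type t) {
        return () -> t;
    }

    /** A function call such as f(col) whose declared return type is returnType. */
    static Expr call(final Type returnType, Expr... args) {
        return () -> returnType;
    }

    /** The = operator: whatever its operands are, the result is BOOLEAN. */
    static Expr equalTo(Expr left, Expr right) {
        return () -> Type.BOOLEAN;
    }

    /** Rejects a non-boolean WHERE predicate with a readable message. */
    static void checkWhere(Expr predicate) {
        if (predicate.type() != Type.BOOLEAN) {
            throw new IllegalArgumentException(
                "WHERE predicate must be BOOLEAN but was " + predicate.type());
        }
    }

    public static void main(String[] args) {
        Expr col = value(Type.STRING);
        checkWhere(call(Type.BOOLEAN, col));                               // where f(col)
        checkWhere(equalTo(call(Type.BOOLEAN, col), value(Type.BOOLEAN))); // where f(col) = true
        checkWhere(equalTo(call(Type.INT, col), value(Type.INT)));         // where f(col) = 1
        try {
            checkWhere(value(Type.INT));                                   // where 1
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this model all three of Ping's predicates type-check, and only a bare non-boolean value such as WHERE 1 is rejected, which is why the error on "where f(col) = true" is surprising.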

From: Ping Zhu [mailto:ping@sharethis.com]
Sent: Wednesday, July 21, 2010 5:10 PM
To: hive-user@hadoop.apache.org
Subject: Re: deploy simple UDF function

It was "where f(col) = true". Sorry for the typo.

Ping
On Wed, Jul 21, 2010 at 5:03 PM, Paul Yang <py...@facebook.com>> wrote:
Hold on, how did 'where f(col) is true' compile? I don't think "is true" is valid HQL. Can you post the full query?

From: Ping Zhu [mailto:ping@sharethis.com<ma...@sharethis.com>]
Sent: Wednesday, July 21, 2010 4:58 PM
To: hive-user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Re: deploy simple UDF function

I figured the source of error: The UDF function (say, f) returns boolean value. The where clause in Hive query was "where f(col) is true)". I change the where clause to "where f(col)". Then it works.

I did other contrived test by changing the return type of UDF to int. The where clause in Hive query is changed to "where f(col)=1". It also works.

Is this an issue/bug of Hive compiler?

Ping
On Wed, Jul 21, 2010 at 3:55 PM, Ping Zhu <pi...@sharethis.com>> wrote:
I have tested this simple UDF function locally. The function itself is properly implemented.

On Wed, Jul 21, 2010 at 3:54 PM, Ping Zhu <pi...@sharethis.com>> wrote:
Hi,

  I have a problem with calling a simple UDF function in Hive query. I compiled the function and created a jar file on my local pc. Then the jar file is sent to a remote Hive cluster and deployed. When this UDF function is called in a Hive query, an error "FAILED: Unknown exception: null" returns. I checked Hive log file, the detailed error message is:

2010-07-21 15:45:33,590 ERROR ql.Driver (SessionState.java:printError(248)) - FAILED: Unknown exception: null
java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.udf.generic.GenericUDFUtils$ConversionHelper.<init>(GenericUDFUtils.java:214)
        at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.initialize(GenericUDFBridge.java:140)
        at org.apache.hadoop.hive.ql.plan.exprNodeGenericFuncDesc.newInstance(exprNodeGenericFuncDesc.java:141)
        at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getFuncExprNodeDesc(TypeCheckProcFactory.java:444)
        at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:541)
        at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:634)
        at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:80)
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:83)
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:117)
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:95)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:5283)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:1005)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:991)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:4234)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:4714)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5203)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:105)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:275)
        at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:320)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:312)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:123)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)


  The versions of Hive installed on my local PC and on the remote Hive cluster are 0.6 and 0.5, respectively. I copied the jar files needed to compile the UDF from the remote Hive cluster, but it still does not work.

  Any suggestions/comments will be highly appreciated.

  Thanks and best regards,

  Ping




Re: deploy simple UDF function

Posted by Ping Zhu <pi...@sharethis.com>.
It was "where f(col) = true". Sorry for the typo.

Ping

On Wed, Jul 21, 2010 at 5:03 PM, Paul Yang <py...@facebook.com> wrote:

>  Hold on, how did ‘where f(col) is true’ compile? I don’t think “is true”
> is valid HQL. Can you post the full query?

RE: deploy simple UDF function

Posted by Paul Yang <py...@facebook.com>.
Hold on, how did 'where f(col) is true' compile? I don't think "is true" is valid HQL. Can you post the full query?

From: Ping Zhu [mailto:ping@sharethis.com]
Sent: Wednesday, July 21, 2010 4:58 PM
To: hive-user@hadoop.apache.org
Subject: Re: deploy simple UDF function

I figured out the source of the error: the UDF (say, f) returns a boolean value. The where clause in the Hive query was "where f(col) is true)". I changed the where clause to "where f(col)", and then it works.

I also ran another contrived test, changing the return type of the UDF to int and the where clause to "where f(col)=1". It also works.

Is this an issue/bug in the Hive compiler?

Ping



Re: deploy simple UDF function

Posted by Ping Zhu <pi...@sharethis.com>.
I figured out the source of the error: the UDF (say, f) returns a boolean
value. The where clause in the Hive query was "where f(col) is true)". I
changed the where clause to "where f(col)", and then it works.

I also ran another contrived test, changing the return type of the UDF to
int and the where clause to "where f(col)=1". It also works.

Is this an issue/bug in the Hive compiler?

Ping
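
For readers following the thread, here is a minimal sketch of the kind of boolean UDF being discussed. The actual source of f was never posted, so the class name IsNonEmpty, the predicate, the jar path and the table name are all hypothetical. The sketch assumes a deployed version would be declared public and would extend org.apache.hadoop.hive.ql.exec.UDF; that superclass is left out so the code compiles without the Hive jars.

```java
// Hypothetical stand-in for the boolean UDF "f" from this thread.
// Assumption: the deployed class would be public and would extend
// org.apache.hadoop.hive.ql.exec.UDF; the superclass is omitted here
// so that the sketch is self-contained.
//
// Hypothetical registration and use from the Hive CLI:
//   ADD JAR /path/to/udf.jar;
//   CREATE TEMPORARY FUNCTION f AS 'IsNonEmpty';
//   SELECT col FROM some_table WHERE f(col);   -- the form reported to work
class IsNonEmpty {
    // Classic Hive UDFs expose one or more methods named evaluate,
    // which Hive locates by reflection.
    // Returning the boxed Boolean allows a null result for null input.
    public Boolean evaluate(String col) {
        if (col == null) {
            return null;
        }
        return !col.trim().isEmpty();
    }

    // Local check, in the spirit of testing the function before deploying it.
    public static void main(String[] args) {
        IsNonEmpty f = new IsNonEmpty();
        System.out.println(f.evaluate("abc"));
        System.out.println(f.evaluate("   "));
        System.out.println(f.evaluate(null));
    }
}
```

Because the function already returns a boolean, it can be used directly as the predicate, which is the "where f(col)" form that worked above.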

On Wed, Jul 21, 2010 at 3:55 PM, Ping Zhu <pi...@sharethis.com> wrote:

> I have tested this simple UDF function locally. The function itself is
> properly implemented.

Re: deploy simple UDF function

Posted by Ping Zhu <pi...@sharethis.com>.
I have tested this simple UDF function locally. The function itself is
properly implemented.
