Posted to dev@hive.apache.org by Chen Song <ch...@yahoo.com> on 2011/10/31 17:15:14 UTC

pass entire row as parameter in hive UDF

Hi All

In Hive, I would like to write a UDF that accepts a sequence of parameters. Because the number of parameters is large and the function I am writing is specific to a set of tables (joined in some way in the SQL), I am wondering whether there is a way to pass the entire row as a wildcard parameter and then query its fields inside the UDF, as shown in the example below.


select my_function(*) as my_column from t1, t2, etc where [a set of join conditions].

I did some investigation and found there was a JIRA opened for this.

https://issues.apache.org/jira/browse/HIVE-1459

This ticket was opened as a follow-up to HIVE-287 to support star expansion in general, and it still seems to be open. If anyone knows a way to pass the entire row as context to a UDF, that would be very helpful.


Regards,
Chen

Re: pass entire row as parameter in hive UDF

Posted by Chen Song <ch...@yahoo.com>.
Can this only be used in a regular select statement, or also as arguments to a UDF? In that case, how should I define my UDF/GenericUDF method signature to accept columns in this form? Will Hive automatically expand the column list and pass the columns to a custom UDF?


If there is any example, that would be very helpful.

Thanks
Chen
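
A minimal sketch for the signature question above, assuming Hive expands the column list into ordinary positional arguments (the thread does not confirm that it does). In Hive's GenericUDF API, initialize(ObjectInspector[]) and evaluate(DeferredObject[]) both receive the arguments as an array, so a function written that way can accept any number of columns. The plain-Java class below only mimics that shape so that it runs without Hive on the classpath; RowArgsSketch and its evaluate method are hypothetical names, not Hive classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a UDF that is handed every column of a row.
// Plain Java only: it does not use or reproduce any Hive class.
public class RowArgsSketch {

    // Accepts any number of field values, the way a variable-arity
    // function would once the column list has been expanded.
    // Returns the non-null fields joined with '|'.
    public static String evaluate(Object... fields) {
        List<String> parts = new ArrayList<>();
        for (Object field : fields) {
            if (field != null) {
                parts.add(String.valueOf(field));
            }
        }
        return String.join("|", parts);
    }

    public static void main(String[] args) {
        // Simulated joined row: (id, name, missing value, amount).
        System.out.println(evaluate(1, "alice", null, 9.5));
    }
}
```

A real GenericUDF would read each value through the ObjectInspector it was given in initialize rather than calling String.valueOf on a raw object; the sketch leaves that out to stay self-contained.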


________________________________
From: Steven Wong <sw...@netflix.com>
To: "user@hive.apache.org" <us...@hive.apache.org>; hive dev list <de...@hive.apache.org>
Sent: Tuesday, November 1, 2011 10:20 PM
Subject: RE: pass entire row as parameter in hive UDF


Would https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select#LanguageManualSelect-REGEXColumnSpecification work for you in the meantime?
 
 
From:Chen Song [mailto:chens_albany@yahoo.com] 
Sent: Monday, October 31, 2011 9:15 AM
To: hive dev list; hive user list
Subject: pass entire row as parameter in hive UDF
 
Hi All
 
In Hive, I would like to write a UDF that accepts a sequence of parameters. Because the number of parameters is large and the function I am writing is specific to a set of tables (joined in some way in the SQL), I am wondering whether there is a way to pass the entire row as a wildcard parameter and then query its fields inside the UDF, as shown in the example below.
 
select my_function(*) as my_column from t1, t2, etc where [a set of join conditions].
 
I did some investigation and found there was a JIRA opened for this.
 
https://issues.apache.org/jira/browse/HIVE-1459
 
This ticket was opened as a follow-up to HIVE-287 to support star expansion in general, and it still seems to be open. If anyone knows a way to pass the entire row as context to a UDF, that would be very helpful.
 
Regards,
Chen

RE: pass entire row as parameter in hive UDF

Posted by Steven Wong <sw...@netflix.com>.
Would https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select#LanguageManualSelect-REGEXColumnSpecification work for you in the meantime?


From: Chen Song [mailto:chens_albany@yahoo.com]
Sent: Monday, October 31, 2011 9:15 AM
To: hive dev list; hive user list
Subject: pass entire row as parameter in hive UDF

Hi All

In Hive, I would like to write a UDF that accepts a sequence of parameters. Because the number of parameters is large and the function I am writing is specific to a set of tables (joined in some way in the SQL), I am wondering whether there is a way to pass the entire row as a wildcard parameter and then query its fields inside the UDF, as shown in the example below.

select my_function(*) as my_column from t1, t2, etc where [a set of join conditions].

I did some investigation and found there was a JIRA opened for this.

https://issues.apache.org/jira/browse/HIVE-1459

This ticket was opened as a follow-up to HIVE-287 to support star expansion in general, and it still seems to be open. If anyone knows a way to pass the entire row as context to a UDF, that would be very helpful.

Regards,
Chen
