Posted to user@spark.apache.org by shahab <sh...@gmail.com> on 2015/03/10 10:44:24 UTC

Registering custom UDAFs with HiveContext in Spark SQL, how?

Hi,

I need to develop a couple of UDAFs and use them in Spark SQL. While UDFs
can be registered as functions in HiveContext, I could not find any
documentation on how UDAFs can be registered in HiveContext. So far, the
only approach I have found is to build a JAR file from the developed UDAF
class and then deploy that JAR file to Spark SQL.

But is there any way to avoid deploying the JAR file and instead register
the UDAF programmatically?


best,
/Shahab

Re: Registering custom UDAFs with HiveContext in Spark SQL, how?

Posted by Deepak <de...@gmail.com>.
Hello Shahab,
Are you able to read tables created in Hive from Spark SQL? If yes, how
are you referring to them?


-- 
Deepak

Re: Registering custom UDAFs with HiveContext in Spark SQL, how?

Posted by Takeshi Yamamuro <li...@gmail.com>.
I think it should be `hiveContext`, not `sqlContext`, because `create
temporary function` is not supported in SQLContext.



-- 
---
Takeshi Yamamuro

Re: Registering custom UDAFs with HiveContext in Spark SQL, how?

Posted by Jon Chase <jo...@gmail.com>.
Shahab -

This should do the trick until Hao's changes are out:


sqlContext.sql("create temporary function foobar as 'com.myco.FoobarUDAF'");

sqlContext.sql("select foobar(some_column) from some_table");


This works without requiring you to 'deploy' a JAR with the UDAF in it -
just make sure the UDAF class is on your project's classpath.
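For reference, a Hive UDAF is built around a partial-aggregation lifecycle: initialize a buffer, iterate over the rows of each partition, merge the partial buffers produced by different tasks, and terminate to produce the final value. The dependency-free Java sketch below illustrates that lifecycle for an average function. To be clear, this is not the actual Hive `GenericUDAFEvaluator` API - all class and method names here are illustrative only - it just shows the pattern such a class has to follow.

```java
// Dependency-free illustration of the partial-aggregation lifecycle a
// Hive UDAF follows: init -> iterate -> merge -> terminate.
// NOT the Hive API; names are illustrative only.
public class AverageAggregator {
    // Aggregation buffer: running sum and count, analogous to a
    // UDAF's AggregationBuffer.
    public static class Buffer {
        double sum = 0.0;
        long count = 0;
    }

    // init(): create a fresh, empty buffer.
    public Buffer init() {
        return new Buffer();
    }

    // iterate(): fold one input row into the buffer.
    public void iterate(Buffer buf, double value) {
        buf.sum += value;
        buf.count += 1;
    }

    // merge(): combine a partial buffer computed by another task.
    public void merge(Buffer buf, Buffer partial) {
        buf.sum += partial.sum;
        buf.count += partial.count;
    }

    // terminate(): produce the final result from the buffer.
    public double terminate(Buffer buf) {
        return buf.count == 0 ? 0.0 : buf.sum / buf.count;
    }

    public static void main(String[] args) {
        AverageAggregator agg = new AverageAggregator();
        // Simulate two tasks, each aggregating its own partition.
        Buffer p1 = agg.init();
        agg.iterate(p1, 1.0);
        agg.iterate(p1, 2.0);
        Buffer p2 = agg.init();
        agg.iterate(p2, 3.0);
        // Merge the partials and finish, as the reduce side would.
        agg.merge(p1, p2);
        System.out.println(agg.terminate(p1)); // prints 2.0
    }
}
```

The merge step is what makes the function distributable: because partial buffers from different partitions can be combined, the aggregation never needs to see all rows on one machine.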





RE: Registering custom UDAFs with HiveContext in Spark SQL, how?

Posted by "Cheng, Hao" <ha...@intel.com>.
Oh, sorry, my bad. Currently Spark SQL doesn’t provide its own user interface for UDAFs, but it can work seamlessly with Hive UDAFs (via HiveContext).

I am also working on the UDAF interface refactoring; after that, we can provide a custom interface for extension.

https://github.com/apache/spark/pull/3247



Re: Registering custom UDAFs with HiveContext in Spark SQL, how?

Posted by shahab <sh...@gmail.com>.
Thanks Hao,
But my question concerns UDAFs (user-defined aggregate functions), not
UDTFs (user-defined table-generating functions).
I would appreciate it if you could point me to a starting point for UDAF
development in Spark.

Thanks
Shahab


RE: Registering custom UDAFs with HiveContext in Spark SQL, how?

Posted by "Cheng, Hao" <ha...@intel.com>.
Currently, Spark SQL doesn’t provide an interface for developing custom UDTFs, but it can work seamlessly with Hive UDTFs.

I am working on the UDTF refactoring for Spark SQL; hopefully it will provide a Hive-independent UDTF interface soon after that.
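On the terminology: a UDTF (user-defined table-generating function) turns one input row into zero or more output rows, whereas a UDAF collapses many rows into one. The small dependency-free Java sketch below mimics the behavior of a split-style table-generating function, similar in spirit to Hive's built-in `explode`. It is an illustration of the concept only, not the Hive `GenericUDTF` API.

```java
import java.util.Arrays;
import java.util.List;

// Illustration of what a table-generating function (UDTF) does:
// one input row in, zero or more output rows out -- the opposite
// direction of a UDAF, which collapses many rows into one value.
public class ExplodeLike {
    // process(): take one row holding a comma-separated string and
    // emit one output row per element.
    public static List<String> process(String csvColumn) {
        return Arrays.asList(csvColumn.split(","));
    }

    public static void main(String[] args) {
        // One input row whose column holds three values...
        String row = "a,b,c";
        // ...expands into three single-value output rows.
        System.out.println(process(row)); // prints [a, b, c]
    }
}
```

In Hive proper, the analogous evaluator emits rows through a collector rather than returning a list, but the one-to-many shape of the function is the same.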
