Posted to user@spark.apache.org by Alex <si...@gmail.com> on 2017/02/02 08:33:29 UTC

Is it okay to run Hive Java UDFs in Spark SQL? Is anybody still doing it?

Hi Team,

Do you really think that making Hive Java UDFs run on Spark SQL will make a
performance difference? Is anybody here actually doing it, i.e. converting
Hive UDFs to run on Spark SQL?

What would be your approach if asked to make a Hive Java UDF project run on
Spark SQL?

Would you run the same Java UDFs using Spark SQL,

or

would you recode all the Java UDFs as Scala UDFs and then run them?


Regards,
Alex

Re: Is it okay to run Hive Java UDFs in Spark SQL? Is anybody still doing it?

Posted by Jörn Franke <jo...@gmail.com>.

There are many performance aspects here, which may relate not only to the UDF itself but also to the configuration of the platform, the data, etc.

You seem to have a performance problem with your UDFs. Maybe you can elaborate on:
1) what data you process (format, etc.)
2) what you are trying to analyse
3) how you implemented your UDFs. Maybe the implementation is not optimal, in which case simply moving it from Hive to Spark will not give you any benefit. Bad code is still bad code in Spark SQL.
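
To make point 3 concrete, here is a minimal sketch in plain Java. It is purely hypothetical and not taken from the project under discussion: the class UdfSketch and both method names are invented, and the methods only stand in for the body of a UDF's evaluate() method, so no Hive or Spark classes are involved. The first variant compiles its regex on every call (once per row in a query), which costs the same regardless of which engine invokes it; the second compiles it once and reuses it.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for the per-row logic of a UDF. The names are
// illustrative assumptions; nothing here depends on Hive or Spark.
final class UdfSketch {

    // Slower variant: String.replaceAll compiles the regex on each call,
    // so the compilation cost is paid again for every row.
    static String stripDigitsSlow(String input) {
        if (input == null) {
            return null; // pass SQL NULL through unchanged
        }
        return input.replaceAll("[0-9]+", "");
    }

    // Compiled once when the class is loaded and reused for every row.
    private static final Pattern DIGITS = Pattern.compile("[0-9]+");

    // Faster variant: same result, but without per-row regex compilation.
    static String stripDigitsFast(String input) {
        if (input == null) {
            return null; // pass SQL NULL through unchanged
        }
        return DIGITS.matcher(input).replaceAll("");
    }
}
```

Both variants return the same values; only the per-row cost differs. If the existing UDFs look like the first variant, fixing that is likely to matter more than the choice between keeping them in Java and rewriting them in Scala.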
