Posted to user@hive.apache.org by Georg Heiler <ge...@gmail.com> on 2018/05/03 17:07:59 UTC

updating hive UDF without restarting the cluster

Hi,

I want to update Hive UDFs without restarting Hive. According to
https://www.cloudera.com/documentation/enterprise/5-14-x/topics/cm_mc_hive_udf.html#concept_zb2_rxr_lw
the property
hive.reloadable.aux.jars.path
must be set. I have set it to /user/hive/libs/udf (which resides on HDFS).
However, the example in their documentation uses
file:///usr/lib/hive/lib/foo.jar, which confuses me. Does this property
only work for files residing on the local file system? Do I understand
correctly that I have to execute Beeline's reload command manually? Also,
if this property does work with HDFS, does Hive automatically pick up
(load) the classes in the jar, so that specifying CREATE FUNCTION
foo AS 'my/path/to/jar-1.jar' is no longer required?

Desired behaviour:
1. copy jar to HDFS /user/hive/lib/udf/foo-1.jar
2. add function to hive:
DROP FUNCTION IF EXISTS foo;
CREATE FUNCTION foo AS 'my.class.path.in.jar.FooUDF' USING JAR
'/user/hive/lib/udf/foo-1.jar';
3. add a new jar to HDFS /user/hive/lib/udf/foo-2.jar
4. update function in hive:
DROP FUNCTION IF EXISTS foo;
CREATE FUNCTION foo AS 'my.class.path.in.jar.FooUDF' USING JAR
'/user/hive/lib/udf/foo-2.jar';
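
The update in steps 3 and 4 could be scripted roughly as follows. This is
only a sketch under assumptions: the helper name udf_update_sql, the file
name update_foo.sql and the HIVE_JDBC_URL variable are hypothetical, and the
jar path is passed exactly as written in the steps above. It merely
generates the statements and does not by itself address the restart problem.

```shell
#!/bin/sh
# Hypothetical helper: print the HiveQL that re-registers a permanent UDF
# from a versioned jar.
# Arguments: $1 = function name, $2 = fully qualified class, $3 = jar path.
udf_update_sql() {
    printf "DROP FUNCTION IF EXISTS %s;\n" "$1"
    printf "CREATE FUNCTION %s AS '%s' USING JAR '%s';\n" "$1" "$2" "$3"
}

# Intended usage on a cluster (commented out so the sketch runs anywhere):
#   hdfs dfs -put foo-2.jar /user/hive/lib/udf/foo-2.jar
#   udf_update_sql foo my.class.path.in.jar.FooUDF \
#       /user/hive/lib/udf/foo-2.jar > update_foo.sql
#   beeline -u "$HIVE_JDBC_URL" -f update_foo.sql
```

Giving each jar a new versioned file name (foo-1.jar, foo-2.jar) keeps the
old jar available until the function definition has been switched over.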

This currently does not work without restarting Hive. Without a restart,
the updated UDF is only partially visible: requests are distributed
round-robin, and only some of them see the new version.

How can I get Hive to not require a restart when updating a UDF?

Best,
Georg