Posted to user@hive.apache.org by Abhishek Gupta <ab...@gmail.com> on 2021/04/12 05:20:10 UTC

Add Hive AUX Jars across the cluster

Hello Community,

We are facing an issue when running count (aggregate) queries in Hive on
Delta format tables (https://github.com/delta-io/connectors) with both
Tez and MR. The job fails with:
Caused by: java.lang.RuntimeException: java.lang.RuntimeException:
java.lang.ClassNotFoundException: Class io.delta.hive.HiveInputFormat not
found

Simple select queries work fine. The problem seems to be that the Delta
Hive jar is only available on the local Hive CLI classpath and not to the
distributed tasks, which causes the distributed count Tez job to fail.
The steps to set hive.aux.jars.path in Hive are described here:
https://github.com/delta-io/connectors/issues/84

I need some help understanding what I am doing wrong, and what the correct
way is to add third-party jars so that they are available on the classpath
across the entire cluster.

Thanks,
Abhishek

Re: Add Hive AUX Jars across the cluster

Posted by Peter Vary <pv...@cloudera.com>.
Hi Abhishek,

You might want to take a look at https://tez.apache.org/install.html. I think the interesting part starts with "Various ways to configure tez.lib.uris".
In our case we had to update the Tez tarball on HDFS by adding the new jars to it, but it really depends on your installation and how your cluster has been set up.

In the meantime you can use the simple "ADD JAR ..." command to check whether the issue is really with the classpath.
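
A minimal sketch of that test, assuming the Delta connector jar has been copied somewhere the Hive session can read. The jar path and table name below are placeholders, not values from this thread:

```sql
-- Placeholder path: point this at wherever the Delta Hive connector jar actually lives.
ADD JAR hdfs:///path/to/delta-hive-assembly.jar;

-- Confirm the jar is registered as a resource for this session.
LIST JARS;

-- Re-run the aggregate that launches a distributed job (hypothetical table name).
SELECT COUNT(*) FROM my_delta_table;
```

If the count succeeds after ADD JAR, that suggests the jar was not reaching the task classpath before. ADD JAR only applies to the current session, so a permanent fix would still have to go through the cluster configuration.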

Thanks,
Peter
