Posted to user@hive.apache.org by Mitesh Peshave <ms...@gmail.com> on 2013/07/18 06:14:56 UTC

Hive does not package a custom inputformat in the MR job jar when the custom inputformat class is added as an aux jar.

Hello,

I am trying to use a custom inputformat for a hive table.

When I add the jar containing the custom inputformat through a client such
as Beeline, by executing the "add jar" command, everything seems to work
fine. In this scenario, Hive seems to pass the inputformat class to the JT
and TTs. I believe it correctly adds the jar to the distributed cache, and
the MR jobs complete without any errors.

But when I put the jar containing the custom inputformat under the Hive
auxlibs dir or the Hive lib dir, Hive does not seem to pass the inputformat
class to the JT and TTs, causing the MR jobs to fail with
ClassNotFoundException.
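
A note on the aux-jar setup described above: Hive reads a comma-separated list of jar locations from the hive.aux.jars.path property. The sketch below builds such a list from a directory of jars. It is only an illustration under assumptions: the function name build_aux_jars_path and the example directory are hypothetical, the directory is assumed to be given as an absolute path, and whether HiveServer then ships these jars to the JT and TTs is the open question of this thread, so it would need to be verified on your own cluster.

```shell
#!/usr/bin/env bash
# Sketch: build a comma-separated list of file:// URIs, one per jar found in
# a directory, in the form commonly used for hive.aux.jars.path.
# The function name and the example directory below are hypothetical.

build_aux_jars_path() {
  local dir="$1"   # absolute path to the directory holding the jars
  local result=""
  local jar
  # The glob expands in sorted order; skip the literal pattern if no jar matches.
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue
    if [ -n "$result" ]; then
      result="$result,"
    fi
    result="${result}file://${jar}"
  done
  printf '%s\n' "$result"
}

# Example usage (the path is an assumption, adjust it to your install):
#   build_aux_jars_path /usr/lib/hive/auxlib
```

The printed value could then be used as the value of hive.aux.jars.path in hive-site.xml, followed by a restart of the Hive server.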

The use case I am looking at is multiple users connecting to
HiveServer with Hive clients and querying a table that uses a custom
inputformat. I would not want each user to have to run the "add
jar" command before querying the table.

Is there a way to add extra jars to the Hive server once and force the
server to push these jars to the JT for every MR job it generates?

Thanks,
Mitesh

Re: Hive does not package a custom inputformat in the MR job jar when the custom inputformat class is added as an aux jar.

Posted by Matouk IFTISSEN <ma...@ysance.com>.
I have been looking for an answer to this for a long time, with no response.
But if you are not in the cloud (AWS, Azure, ...), you can add the jar to
$HADOOP_HOME/lib on all of your DataNodes and then restart the
mapreduce-tasktracker service like this:

/etc/init.d/*mapreduce-tasktracker stop

/etc/init.d/*mapreduce-tasktracker start

Hope this helps ;)
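
The per-node steps above (copy the jar into $HADOOP_HOME/lib, then stop and start the TaskTracker) can be sketched as a small dry-run helper. This is a hedged illustration, not a tested deployment script: the function name print_deploy_commands, the host names, and the jar path are hypothetical, it assumes the same Hadoop home on every node, and it only prints the commands so they can be reviewed before anything is run.

```shell
#!/usr/bin/env bash
# Sketch: print (do not run) the commands that would copy a jar into
# <hadoop_home>/lib on each node and restart that node's TaskTracker.
# Host names, paths, and the function name are hypothetical.

print_deploy_commands() {
  local jar="$1"          # local path of the jar to distribute
  local hadoop_home="$2"  # Hadoop home on the remote nodes (assumed identical)
  shift 2
  local host
  for host in "$@"; do
    printf 'scp %s %s:%s/lib/\n' "$jar" "$host" "$hadoop_home"
    # Single quotes keep the init.d glob unexpanded until it reaches the remote shell.
    printf "ssh %s '/etc/init.d/*mapreduce-tasktracker stop'\n" "$host"
    printf "ssh %s '/etc/init.d/*mapreduce-tasktracker start'\n" "$host"
  done
}

# Example usage (all values are assumptions):
#   print_deploy_commands /tmp/custom-inputformat.jar /usr/lib/hadoop node1 node2
```

Reviewing the printed commands first makes it easier to catch a wrong path or host before any TaskTracker is restarted.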


2013/7/18 Andrew Trask <an...@digitalreasoning.com>

> Put them in hive's lib folder?
>
> Sent from my Rotary Phone

Re: Hive does not package a custom inputformat in the MR job jar when the custom inputformat class is added as an aux jar.

Posted by Andrew Trask <an...@digitalreasoning.com>.
Put them in hive's lib folder?

Sent from my Rotary Phone
