Posted to common-user@hadoop.apache.org by Mark Kerzner <ma...@gmail.com> on 2011/02/25 14:06:17 UTC

Packaging for Hadoop - what about the Hadoop libraries?

Hi,

when packaging additional libraries for an MR job, I can use a script or a
Maven Hadoop plugin, but what about the Hadoop libraries themselves? Should
I package them in, or should I rely on those jars that are already present
in the Hadoop installation where the code will be running? What is the best
practice?

Thank you,
Mark

Re: Packaging for Hadoop - what about the Hadoop libraries?

Posted by pr...@nokia.com.

Just package the libraries that your MR jobs depend on. There is no need to package the Hadoop libraries, but make sure your Hadoop client version matches the server version.

Praveen

On Feb 25, 2011, at 8:07 AM, "ext Mark Kerzner" <ma...@gmail.com> wrote:

> Hi,
> 
> when packaging additional libraries for an MR job, I can use a script or a
> Maven Hadoop plugin, but what about the Hadoop libraries themselves? Should
> I package them in, or should I rely on those jars that are already present
> in the Hadoop installation where the code will be running? What is the best
> practice?
> 
> Thank you,
> Mark

Re: Packaging for Hadoop - what about the Hadoop libraries?

Posted by James Seigel <ja...@tynt.com>.

Rely on the ones that are already present.

The other ones are a little trickier, though not really once you “get it”.

-libjars <list of supporting jars> on the command line will ship the “supporting” jars out with the job to the mappers and reducers. However, if for some reason you need them during job submission, they won’t be present there; you either need to have them on the command-line classpath or bundle them into the job jar.
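
A minimal sketch of the two options described above, assuming a driver that parses generic options (for example via ToolRunner), since -libjars is only recognized in that case. The jar names, the class com.example.MyJob, and the input/output arguments are hypothetical placeholders. The command is assembled and printed rather than executed, so it can be checked before running it against a real cluster.

```shell
# Hypothetical supporting jars that the map and reduce tasks need.
JARS="lib/dep-one.jar lib/dep-two.jar"

# -libjars takes a comma-separated list; these jars ship with the job.
LIBJARS=$(echo "$JARS" | tr ' ' ',')

# For classes needed at submission time, the same jars can be put on the
# client classpath; the hadoop launcher script reads HADOOP_CLASSPATH
# (colon-separated).
HADOOP_CLASSPATH=$(echo "$JARS" | tr ' ' ':')
export HADOOP_CLASSPATH

# Assemble the submission command (myjob.jar and com.example.MyJob are
# placeholders); it is printed here, not run.
CMD="hadoop jar myjob.jar com.example.MyJob -libjars $LIBJARS input output"
echo "$CMD"
```

The alternative to setting HADOOP_CLASSPATH is to bundle the supporting jars into the job jar itself, in which case the -libjars argument is not needed for them.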

Cheers
James.


On 2011-02-25, at 6:06 AM, Mark Kerzner wrote:

> Hi,
> 
> when packaging additional libraries for an MR job, I can use a script or a
> Maven Hadoop plugin, but what about the Hadoop libraries themselves? Should
> I package them in, or should I rely on those jars that are already present
> in the Hadoop installation where the code will be running? What is the best
> practice?
> 
> Thank you,
> Mark