Posted to user@pig.apache.org by Niels Basjes <Ni...@basjes.nl> on 2013/04/16 22:44:12 UTC

Classpath issues with a custom loadfunc

Hi,

I'm attempting to build a custom LoadFunc for pig and I'm running into a
rather silly issue.

My project has several dependencies and I've been trying to create a single
jar that contains all of these dependencies (I'm using the maven assembly
plugin).

Now if I manually place all dependencies in my pig script using a list of
"REGISTER foo.jar" statements it all works.
What I want is to have a single jar with everything in it so that I only
need a single jar in the pig code.

After some playing around I found that if I unpack all the dependencies
and place their contents in the root of my jar, it all seems to work
... but it turns into a big mess with warnings about "[INFO]
META-INF/MANIFEST.MF already added, skipping".

What is the clean way of doing this while still letting Pig find the
dependencies inside the jar?
Can anyone point me towards a working example?

Thanks.

-- 
Best regards / Met vriendelijke groeten,

Niels Basjes

Re: Classpath issues with a custom loadfunc

Posted by Ruslan Al-Fakikh <me...@gmail.com>.
Hi Niels,

This is not really a Pig question; it is more of a Java packaging question.
What exactly went wrong with the maven assembly plugin? Maybe the maven shade
plugin would work better? (though I've never tried it myself)
For me, the simplest way is to just register all the needed dependencies,
and I've seen this approach in some Hadoop docs. Also note that some of the
dependencies may already be present in the Pig/Hadoop environment, for
example JodaTime, so you don't have to register them. But make sure your
code is compatible with the versions of those dependencies provided there.
Hope that helps, let me know if you have questions.
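
A minimal sketch of the "register everything" approach, in case it is
useful. All paths, jar names and the loader class name below are
hypothetical placeholders, not taken from your project:

```pig
-- Hypothetical jar names and paths, for illustration only.
-- Register the jar that contains the custom LoadFunc.
REGISTER /path/to/my-loadfunc.jar;

-- Register each dependency that the cluster does not already provide.
REGISTER /path/to/dependency-one.jar;
REGISTER /path/to/dependency-two.jar;

-- com.example.pig.MyLoader is a stand-in for the custom LoadFunc class.
records = LOAD '/path/to/input' USING com.example.pig.MyLoader();
DUMP records;
```

Jars that are already present in the Pig/Hadoop environment can be left
out of the list, as long as the versions match what the code expects.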

Best Regards,
Ruslan


On Wed, Apr 17, 2013 at 1:40 AM, Prashant Kommireddi <pr...@gmail.com> wrote:

> Hi Niels,
>
> Pig supports globbing in REGISTER statements. You could place all the jars
> in a certain location and have them all be picked up.
>
> REGISTER *.jar

Re: Classpath issues with a custom loadfunc

Posted by Prashant Kommireddi <pr...@gmail.com>.
Hi Niels,

Pig supports globbing in REGISTER statements. You could place all the jars
in a certain location and have them all be picked up.

REGISTER *.jar
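
As a rough sketch of how that could look in a script, assuming all jars
have been copied into one directory (the directory, input path and loader
class name here are hypothetical placeholders):

```pig
-- Hypothetical directory holding the LoadFunc jar and all of its dependencies.
REGISTER /path/to/udf-libs/*.jar;

-- com.example.pig.MyLoader is a stand-in for the custom LoadFunc class.
records = LOAD '/path/to/input' USING com.example.pig.MyLoader();
DUMP records;
```

This keeps the script at a single REGISTER line without having to merge
the dependencies into one jar.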

