Posted to user@pig.apache.org by Yong-gang Cao <ch...@gmail.com> on 2010/10/22 02:08:57 UTC

is it possible to use self-contained jar in pig?

Hi,
  I have a headache using UDFs with many dependencies: adding them with the
register command is very painful and not extensible. I can build a
self-contained jar for a Hadoop job using Maven (a jar with a lib directory
whose jars are all used for class lookup), but that doesn't seem to
work for Pig. Pig just treats that jar as a regular jar and tries to
find classes directly inside it instead of inside the embedded jars.
  Is there a way to make Pig look into the self-contained big jar for
class loading, the way Hadoop does?
Thanks!

-- 
Regards,

Yong-gang Cao
Seattle,WA,98104

RE: is it possible to use self-contained jar in pig?

Posted by "Kaluskar, Sanjay" <sk...@informatica.com>.
I wrestled with this issue too, and I tried out a few things, including
creating a single top-level jar (one containing the dependency jars, and
one containing their expanded files). As you found out, the jar-with-jars
approach doesn't work. The jar-with-expanded-jars approach could work if
you don't have conflicting file names (incl. resources) in the
dependencies you are trying to package. It didn't work for me. The
solution that I have isn't very nice, but it works: I have a
top-level jar that lists all the dependencies in its manifest (attribute
Class-Path). The Maven assembly plugin can be used to automate this,
which makes it extensible and less error-prone. Unfortunately, Pig will
not add all the dependencies to the class path, so you will have to add
them to the class path yourself by directly editing mapred-site.xml
(using the distributed cache).

-sanjay
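
For illustration only, here is a minimal sketch of one way to get a
Class-Path attribute written into the top-level jar's manifest. It uses
the maven-jar-plugin's addClasspath option, which is an assumption and not
necessarily the setup described above (that message mentions the assembly
plugin). The lib/ prefix is a hypothetical location for the dependency
jars, relative to the top-level jar.

    <plugin>
      <artifactId>maven-jar-plugin</artifactId>
      <configuration>
        <archive>
          <manifest>
            <!-- write a Class-Path entry listing the project's dependencies -->
            <addClasspath>true</addClasspath>
            <!-- hypothetical layout: dependency jars sit in lib/ next to this jar -->
            <classpathPrefix>lib/</classpathPrefix>
          </manifest>
        </archive>
      </configuration>
    </plugin>

This only records the dependency names in the manifest. The dependency
jars themselves still have to be made available on the cluster, as noted
above.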


Re: is it possible to use self-contained jar in pig?

Posted by Dave Wellman <da...@tynt.com>.
I have had some success creating and registering one single jar with dependencies. Downside: the jar can be quite big. If you use Maven to build your jar files, just add the following to the plugins section of your pom.xml:


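    <plugin>
      <artifactId>maven-assembly-plugin</artifactId>
      <configuration>
        <!-- built-in descriptor: unpacks all dependencies into one jar -->
        <descriptorRefs>
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
      </configuration>
      <executions>
        <execution>
          <id>make-assembly</id>
          <!-- build the combined jar as part of the normal package phase -->
          <phase>package</phase>
          <goals>
            <!-- note: newer releases of the assembly plugin replace the
                 "attached" goal with "single" -->
            <goal>attached</goal>
          </goals>
        </execution>
      </executions>
    </plugin>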


