Posted to dev@pig.apache.org by Stefan Groschupf <sg...@101tec.com> on 2008/02/13 20:12:37 UTC
hadoop15 & hadoop14 both in lib
Hi,
sorry for the traffic.
Why is there a hadoop14 and a hadoop15 jar in lib?
Wouldn't one be enough? As far as I understand the code, 15 is required
since generics are used.
Thanks for any clarification.
Stefan
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
101tec Inc.
Menlo Park, California, USA
http://www.101tec.com
Re: hadoop15 & hadoop14 both in lib
Posted by Stefan Groschupf <sg...@101tec.com>.
On Feb 19, 2008, at 3:54 AM, Craig Macdonald wrote:
>> As mentioned earlier, from my point of view we need to write a pig
>> shell script anyhow.
> What would the shell script do that the Perl script pig.pl couldn't
> be made to do?
Oops, I never paid attention to the perl script. I guess nothing,
except remove the perl dependency but introduce a dependency on a
shell (cygwin on windows). But since hadoop already requires cygwin on
windows anyway, I guess I would prefer that.
Re: hadoop15 & hadoop14 both in lib
Posted by Craig Macdonald <cr...@dcs.gla.ac.uk>.
Stefan Groschupf wrote:
> On Feb 16, 2008, at 1:56 AM, Andrzej Bialecki wrote:
>> If you want to keep hadoop-related jars separate from other jars, you
>> could put them all together in a lib/hadoop subdir.
> +1, I like that idea.
> As mentioned earlier, from my point of view we need to write a pig
> shell script anyhow.
What would the shell script do that the Perl script pig.pl couldn't be
made to do?
C
Re: hadoop15 & hadoop14 both in lib
Posted by Stefan Groschupf <sg...@101tec.com>.
On Feb 16, 2008, at 1:56 AM, Andrzej Bialecki wrote:
> If you want to keep hadoop-related jars separate from other jars,
> you could put them all together in a lib/hadoop subdir.
+1, I like that idea.
As mentioned earlier, from my point of view we need to write a pig
shell script anyhow.
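For illustration only, the lib/hadoop layout proposed above could be consumed by a launcher roughly as sketched below. This is written in Python rather than shell or Perl purely to keep the example short; the function name build_classpath and the exact directory layout (jars directly in lib, hadoop jars in lib/hadoop) are assumptions, not anything that exists in the Pig tree.

```python
import os
from pathlib import Path


def build_classpath(lib_dir):
    """Join every jar in lib_dir and lib_dir/hadoop into one classpath string.

    Hypothetical helper: assumes general jars sit directly in lib_dir and
    hadoop-related jars sit in a 'hadoop' subdirectory, as suggested in the thread.
    """
    lib = Path(lib_dir)
    # Sort each group so the resulting classpath order is deterministic.
    jars = sorted(lib.glob("*.jar")) + sorted((lib / "hadoop").glob("*.jar"))
    # os.pathsep is ':' on Unix-like systems and ';' on Windows.
    return os.pathsep.join(str(jar) for jar in jars)
```

A launcher would then pass the returned string to java via -cp. Keeping the hadoop jars in their own subdirectory means swapping hadoop versions only touches that one directory.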
Re: hadoop15 & hadoop14 both in lib
Posted by Andrzej Bialecki <ab...@getopt.org>.
Alan Gates wrote:
> A few answers to your questions.
>
> The hadoopX.jar files in pig's lib directory are not the standard hadoop
> jars. They differ in two ways. First, we recreate a hadoop jar that
> rolls in all the jars needed to compile with hadoop. This is somewhere
> around 15 jars. Second, we have a small hack we add for historical
> reasons. We need to resolve both of those issues. Once we do we can
> use stock hadoop jars instead of carrying along our own.
If you want to keep hadoop-related jars separate from other jars, you
could put them all together in a lib/hadoop subdir. Re-packaging jars is
confusing: you lose the versioning information of dependent jars, and
some jars may depend on specific values in MANIFEST, which repackaging
may have dropped.
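To make the MANIFEST point concrete, here is a minimal sketch that reads the main-section attributes from a jar's META-INF/MANIFEST.MF, so a stock jar and a repackaged one can be compared. The function name read_manifest_attributes is hypothetical, and whether any given jar actually carries attributes such as Implementation-Version is an assumption to check, not something stated in this thread.

```python
import zipfile


def read_manifest_attributes(jar_path):
    """Return the main-section attributes of a jar's manifest as a dict.

    Returns an empty dict when the jar has no META-INF/MANIFEST.MF.
    """
    with zipfile.ZipFile(jar_path) as jar:
        try:
            raw = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
        except KeyError:
            return {}  # the archive contains no manifest entry

    attributes = {}
    last_key = None
    for line in raw.splitlines():
        if not line.strip():
            break  # a blank line ends the main section
        if line.startswith(" ") and last_key is not None:
            # A line starting with one space continues the previous value.
            attributes[last_key] += line[1:]
            continue
        key, sep, value = line.partition(":")
        if sep:
            last_key = key.strip()
            attributes[last_key] = value.lstrip(" ")
    return attributes
```

Running this on both the original and the repackaged jar and comparing the two dicts would show which attributes, if any, were lost in repackaging.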
Regarding the hack: we had similar problems in Nutch. If changes are
required to core Hadoop, perhaps it's better to submit them to Hadoop
for inclusion. If they are a temporary hack, perhaps a facade class is a
better approach. In some cases in Nutch we had to use a patched library
anyway, which was then clearly marked as such and diffs from the stock
version were available in JIRA.
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
RE: hadoop15 & hadoop14 both in lib
Posted by Olga Natkovich <ol...@yahoo-inc.com>.
I am working on integrating 0.16 and will remove the 0.14 jar as part of that update.
Olga
> -----Original Message-----
> From: Alan Gates [mailto:gates@yahoo-inc.com]
> Sent: Friday, February 15, 2008 4:02 PM
> To: pig-dev@incubator.apache.org
> Subject: Re: hadoop15 & hadoop14 both in lib
>
> A few answers to your questions.
>
> The hadoopX.jar files in pig's lib directory are not the
> standard hadoop jars. They differ in two ways. First, we
> recreate a hadoop jar that rolls in all the jars needed to
> compile with hadoop. This is somewhere around 15 jars.
> Second, we have a small hack we add for historical reasons.
> We need to resolve both of those issues. Once we do we can
> use stock hadoop jars instead of carrying along our own.
>
> The reason for having multiple versions is to support
> compilation against multiple versions of hadoop. I'm not
> sure what the use of hadoop14 is since we can't compile
> against it anymore. But once we've tested a build against
> hadoop16 (coming soon), we'll need a library for that. And
> we will be able to build against either 15 or 16.
>
> Alan.
>
> Benjamin Francisoud wrote:
> > Stefan Groschupf wrote:
> >>> Also, I think the Pig project should follow the common practice and
> >>> NOT rename third-party libraries, i.e. in this case to keep the
> >>> original name of hadoop-0.15.0.jar (if indeed it was that Hadoop
> >>> release).
> >>
> >> 100% agreed. That would be great!
> >> A perfect solution would be using
> >> http://ant.apache.org/ivy/.
> >> However, as I understand it, this requires that the hadoop developers
> >> publish their releases into a repository.
> >> However, I'm not sure if the hadoop developers are willing to do that.
> >> It would help quite a lot for many other projects as well.
> >>
> >> Stefan
> > Even though I think ivy is great, pig has so few libs (4 if I exclude
> > hadoop14) that I think a "classical" lib folder holding jars (with
> > version numbers in the jar names) could be enough...
> >
> > http://svn.apache.org/repos/asf/incubator/pig/trunk/lib/
> >
> > my 2 cents
> >
>
Re: hadoop15 & hadoop14 both in lib
Posted by Alan Gates <ga...@yahoo-inc.com>.
A few answers to your questions.
The hadoopX.jar files in pig's lib directory are not the standard hadoop
jars. They differ in two ways. First, we recreate a hadoop jar that
rolls in all the jars needed to compile with hadoop. This is somewhere
around 15 jars. Second, we have a small hack we add for historical
reasons. We need to resolve both of those issues. Once we do we can
use stock hadoop jars instead of carrying along our own.
The reason for having multiple versions is to support compilation
against multiple versions of hadoop. I'm not sure what the use of
hadoop14 is since we can't compile against it anymore. But once we've
tested a build against hadoop16 (coming soon), we'll need a library for
that. And we will be able to build against either 15 or 16.
Alan.
Benjamin Francisoud wrote:
> Stefan Groschupf wrote:
>>> Also, I think the Pig project should follow the common practice and
>>> NOT rename third-party libraries, i.e. in this case to keep the
>>> original name of hadoop-0.15.0.jar (if indeed it was that Hadoop
>>> release).
>>
>> 100% agreed. That would be great!
>> A perfect solution would be using
>> http://ant.apache.org/ivy/.
>> However, as I understand it, this requires that the hadoop developers
>> publish their releases into a repository.
>> However, I'm not sure if the hadoop developers are willing to do that.
>> It would help quite a lot for many other projects as well.
>>
>> Stefan
> Even though I think ivy is great, pig has so few libs (4 if I exclude
> hadoop14) that I think a "classical" lib folder holding jars (with
> version numbers in the jar names) could be enough...
>
> http://svn.apache.org/repos/asf/incubator/pig/trunk/lib/
>
> my 2 cents
>
Re: hadoop15 & hadoop14 both in lib
Posted by Benjamin Francisoud <be...@joost.com>.
Stefan Groschupf wrote:
>> Also, I think the Pig project should follow the common practice and
>> NOT rename third-party libraries, i.e. in this case to keep the
>> original name of hadoop-0.15.0.jar (if indeed it was that Hadoop
>> release).
>
> 100% agreed. That would be great!
> A perfect solution would be using
> http://ant.apache.org/ivy/.
> However, as I understand it, this requires that the hadoop developers
> publish their releases into a repository.
> However, I'm not sure if the hadoop developers are willing to do that.
> It would help quite a lot for many other projects as well.
>
> Stefan
Even though I think ivy is great, pig has so few libs (4 if I exclude
hadoop14) that I think a "classical" lib folder holding jars (with
version numbers in the jar names) could be enough...
http://svn.apache.org/repos/asf/incubator/pig/trunk/lib/
my 2 cents
Re: hadoop15 & hadoop14 both in lib
Posted by Stefan Groschupf <sg...@101tec.com>.
> Also, I think the Pig project should follow the common practice and
> NOT rename third-party libraries, i.e. in this case to keep the
> original name of hadoop-0.15.0.jar (if indeed it was that Hadoop
> release).
100% agreed. That would be great!
A perfect solution would be using http://ant.apache.org/ivy/.
However, as I understand it, this requires that the hadoop developers
publish their releases into a repository.
However, I'm not sure if the hadoop developers are willing to do that.
It would help quite a lot for many other projects as well.
Stefan
Re: hadoop15 & hadoop14 both in lib
Posted by Andrzej Bialecki <ab...@getopt.org>.
Benjamin Francisoud wrote:
> Stefan Groschupf wrote:
>> Hi,
>> sorry for the traffic.
>> Why is there a hadoop14 and a hadoop15 jar in lib?
>> Wouldn't one be enough? As far as I understand the code, 15 is required
>> since generics are used.
>> Thanks for any clarification.
>> Stefan
> The build.xml excludes hadoop14 from the classpath when building...
>
> As a user, I don't need backward compatibility with 14; our production
> cluster is using 15.
Also, I think the Pig project should follow the common practice and NOT
rename third-party libraries, i.e. in this case to keep the original
name of hadoop-0.15.0.jar (if indeed it was that Hadoop release).
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
Re: hadoop15 & hadoop14 both in lib
Posted by Stefan Groschupf <sg...@101tec.com>.
> The build.xml excludes hadoop14 from the classpath when building...
So can we remove hadoop 14 from the project then? Can one of the
contributors please do that?
Thanks.
Stefan
Re: hadoop15 & hadoop14 both in lib
Posted by Benjamin Francisoud <be...@joost.com>.
Stefan Groschupf wrote:
> Hi,
> sorry for the traffic.
> Why is there a hadoop14 and a hadoop15 jar in lib?
> Wouldn't one be enough? As far as I understand the code, 15 is required
> since generics are used.
> Thanks for any clarification.
> Stefan
The build.xml excludes hadoop14 from the classpath when building...
As a user, I don't need backward compatibility with 14; our production
cluster is using 15.