Posted to dev@hive.apache.org by Carl Steinbach <ca...@cloudera.com> on 2010/02/15 03:12:21 UTC

Re: Hive Installation Error

> I do not know tons about POM/IVY, but I am not sure I am very happy
> with it in this build process. On one hand bundling three versions of
> hadoop core seems wasteful.
>

It's actually four versions of Hadoop (many of which include older versions
of Hive), but who's counting ;)

> That time is a little too long for my liking. Likewise I think shim
> building three branches is a good thing, but I think we would be
> better with some type of configure process
>

I did some work on this in HIVE-984. Please comment on this ticket
if you think it makes sense.

> Also if ivy/pom is flaky, I am not sure its 'coolness' factor
> outweighs its practicality. In my experience a failing build is a very
> frustrating thing for some users and others might be too shy to help,
> or people who just walk away and start searching for another
> alternative application.
>

Ivy is not the problem here. In fact, one of the goals of Ivy and dependency
managers in general is to provide more than one way of resolving a
dependency. Rather, the problem is that our Ivy configuration is broken.
We call out dependencies on the Hadoop 0.17.2.1, 0.18.3, 0.19.0, and 0.20.0
source tarballs, yet only provide one resolver in ivy/ivysettings.xml that
can actually satisfy these dependencies (i.e. archive.apache.org/dist). If
archive.apache.org is unreachable, our ivysettings configuration causes Ivy
to fall back to one of the other default resolvers, but this time looking
for Hadoop POMs. This will always fail since no one has published Hadoop
POMs (see HADOOP-6382).

I think we have two options for fixing this problem:

1) Add more Apache mirrors to the list of URL resolvers in ivysettings.xml
    (and fix ivysettings so that it does not look for Hadoop POMs).

2) Temporarily satisfy the shim dependencies by checking the Hadoop JARs
    into the lib directory, and eventually transition back to Ivy when
    Hadoop POMs become available.

Until Hadoop POMs become available, I think (2) is the better option,
mostly because (1) is wasteful (we only need the JARs, not all of the
source) and will probably still fail for some subset of users who can't
access any of the Apache mirrors included in our configuration.
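
For reference, option (1) might look roughly like the sketch below. This is
not our actual ivy/ivysettings.xml: the resolver names are made up, the
mirror host is a placeholder, and the directory layout in the artifact
patterns is an assumption. The idea is a chain made up only of URL resolvers
with artifact patterns, so that Ivy should try each location in turn for the
tarball instead of falling back to a default resolver that looks for POMs.

```xml
<!-- Sketch only: resolver names and the mirror host are hypothetical. -->
<ivysettings>
  <settings defaultResolver="hadoop-source-chain"/>
  <resolvers>
    <!-- returnFirst="true" stops at the first resolver that finds the module -->
    <chain name="hadoop-source-chain" returnFirst="true">
      <url name="apache-archive">
        <!-- Assumed layout of the archive; adjust to the real directory structure -->
        <artifact pattern="http://archive.apache.org/dist/hadoop/core/[module]-[revision]/[module]-[revision].[ext]"/>
      </url>
      <url name="apache-mirror-1">
        <!-- Placeholder host; substitute a real Apache mirror -->
        <artifact pattern="http://mirror.example.org/apache/hadoop/core/[module]-[revision]/[module]-[revision].[ext]"/>
      </url>
    </chain>
  </resolvers>
</ivysettings>
```

Even with something like this in place, the build still downloads full
source tarballs and still depends on at least one listed host being
reachable.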

Thanks.

Carl