Posted to dev@spark.apache.org by Stephen Boesch <ja...@gmail.com> on 2014/07/18 01:00:26 UTC
Current way to include hive in a build
Having looked at make-distribution.sh on trunk, I see that the --with-hive and
--with-yarn options are now deprecated.
Here is the way I have built it:
Added to pom.xml:
<profile>
  <id>cdh5</id>
  <activation>
    <activeByDefault>false</activeByDefault>
  </activation>
  <properties>
    <hadoop.version>2.3.0-cdh5.0.0</hadoop.version>
    <yarn.version>2.3.0-cdh5.0.0</yarn.version>
    <hbase.version>0.96.1.1-cdh5.0.0</hbase.version>
    <zookeeper.version>3.4.5-cdh5.0.0</zookeeper.version>
  </properties>
</profile>
mvn -Pyarn -Pcdh5 -Phive -Dhadoop.version=2.3.0-cdh5.0.1 -Dyarn.version=2.3.0-cdh5.0.0 -DskipTests clean package
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Spark Project Parent POM .......................... SUCCESS [3.165s]
[INFO] Spark Project Core ................................ SUCCESS [2:39.504s]
[INFO] Spark Project Bagel ............................... SUCCESS [7.596s]
[INFO] Spark Project GraphX .............................. SUCCESS [22.027s]
[INFO] Spark Project ML Library .......................... SUCCESS [36.284s]
[INFO] Spark Project Streaming ........................... SUCCESS [24.309s]
[INFO] Spark Project Tools ............................... SUCCESS [3.147s]
[INFO] Spark Project Catalyst ............................ SUCCESS [20.148s]
[INFO] Spark Project SQL ................................. SUCCESS [18.560s]
[INFO] Spark Project Hive ................................ FAILURE [33.962s]
[ERROR] Failed to execute goal
org.apache.maven.plugins:maven-dependency-plugin:2.4:copy-dependencies
(copy-dependencies) on project spark-hive_2.10: Execution copy-dependencies
of goal
org.apache.maven.plugins:maven-dependency-plugin:2.4:copy-dependencies
failed: Plugin org.apache.maven.plugins:maven-dependency-plugin:2.4 or one
of its dependencies could not be resolved: Could not find artifact
commons-logging:commons-logging:jar:1.0.4 -> [Help 1]
Is anyone presently building with -Phive who has a suggestion for this?
Re: Current way to include hive in a build
Posted by Stephen Boesch <ja...@gmail.com>.
Thanks very much, Patrick and Sean. I have the build working now as follows:
mvn -Pyarn -Pcdh5 -Phive -DskipTests clean package
In addition, I am in the midst of running some tests, and so far so good.
The pom.xml changes:
Added to the top-level (parent) pom.xml:
<profile>
  <id>cdh5</id>
  <properties>
    <hadoop.version>2.3.0-cdh5.0.0</hadoop.version>
    <yarn.version>2.3.0-cdh5.0.0</yarn.version>
    <zookeeper.version>3.4.5-cdh5.0.0</zookeeper.version>
    <protobuf.version>2.5.0</protobuf.version>
    <jets3t.version>0.9.0</jets3t.version>
    <hbase.version>0.96.1.1-cdh5.0.0</hbase.version>
  </properties>
</profile>
Added four dependencies to examples/pom.xml, one each for hbase-common,
hbase-client, hbase-protocol, and hbase-server.
Here is the one for hbase-common:
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-common</artifactId>
  <version>${hbase.version}</version>
  <exclusions>
    <exclusion>
      <groupId>asm</groupId>
      <artifactId>asm</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.jboss.netty</groupId>
      <artifactId>netty</artifactId>
    </exclusion>
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>netty</artifactId>
    </exclusion>
    <exclusion>
      <groupId>commons-logging</groupId>
      <artifactId>commons-logging</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.jruby</groupId>
      <artifactId>jruby-complete</artifactId>
    </exclusion>
  </exclusions>
</dependency>
Duplicate the above for:
<artifactId>hbase-client</artifactId>
..
<artifactId>hbase-protocol</artifactId>
..
<artifactId>hbase-server</artifactId>
..
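For illustration, here is roughly what the duplicated hbase-client entry would look like. This is a sketch that assumes the exclusion list carries over unchanged from hbase-common; the thread does not say whether each of the other three artifacts needs every exclusion. The hbase-protocol and hbase-server entries would follow the same pattern with only the artifactId changed.

```xml
<!-- Sketch only: same shape as the hbase-common entry, with the artifactId swapped. -->
<!-- Assumption: the exclusions are copied as-is; trim them if an artifact does not pull in a given library. -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <!-- hbase.version is the property defined in the cdh5 profile of the parent pom.xml -->
  <version>${hbase.version}</version>
  <exclusions>
    <exclusion>
      <groupId>asm</groupId>
      <artifactId>asm</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.jboss.netty</groupId>
      <artifactId>netty</artifactId>
    </exclusion>
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>netty</artifactId>
    </exclusion>
    <exclusion>
      <groupId>commons-logging</groupId>
      <artifactId>commons-logging</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.jruby</groupId>
      <artifactId>jruby-complete</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Because the version comes from ${hbase.version}, the entry only resolves to the CDH build of HBase when the cdh5 profile is active (-Pcdh5) or the property is otherwise set.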
2014-07-18 3:16 GMT-07:00 Sean Owen <so...@cloudera.com>:
> This build invocation works just as you have it, for me. (At least, it
> gets through Hive; Examples fails for a different unrelated reason.)
>
> commons-logging 1.0.4 exists in Maven for sure. Maybe there is some
> temporary problem accessing Maven's repo?
Re: Current way to include hive in a build
Posted by Sean Owen <so...@cloudera.com>.
This build invocation works just as you have it, for me. (At least, it
gets through Hive; Examples fails for a different unrelated reason.)
commons-logging 1.0.4 exists in Maven for sure. Maybe there is some
temporary problem accessing Maven's repo?
Re: Current way to include hive in a build
Posted by Patrick Wendell <pw...@gmail.com>.
Hey Stephen,
The only change to the build was that we now ask users to pass -Phive and
-Pyarn instead of --with-hive and --with-yarn (which internally just set
-Phive and -Pyarn). I don't think this should affect the dependency
graph.
Just to test this, what happens if you run *without* the CDH profile
and build with hadoop version 2.3.0? Does that work?
- Patrick