Posted to dev@mahout.apache.org by Pat Ferrel <pa...@occamsmachete.com> on 2014/10/31 16:55:11 UTC

Build broken

@Gokhan I’m confused too. I think there are too many moving parts. I merged the Spark 1.1 branch into master before trying your patch. Master fails to build on the nightly machines, so your patch may offer a clue, but it is not what broke the build.

@Dmitriy Yes, I think I discovered this, so I am indeed using the Spark 1.1 artifacts from my local Maven cache, which I built for Hadoop 1.2.1. Building Mahout against those should work since all the versions match. However, I get the same error the master branch hits on the build machines. This leads me to question which version of Hadoop is really being used with Gokhan’s patch. @Gokhan, is “-Dhadoop.version=1.2.1” the correct CLI option?
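
One way to check which Hadoop version Maven actually resolves (a sketch, untested, assuming the patch keys hadoop-client off a hadoop.version property):

    # from the mahout source root, print the resolved dependency tree,
    # filtered to org.apache.hadoop artifacts so the version is easy to spot
    mvn -Dhadoop.version=1.2.1 dependency:tree -Dincludes=org.apache.hadoop

If the tree still shows some other Hadoop version, either the property name is wrong or something is pinning it transitively.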

Moving back to master on my system without this patch, which still uses Spark 1.1 and Hadoop 1.2.1, the build is not broken, but it does seem to be broken on the build machine. Could that be because I am pulling artifacts from my local Maven cache?
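
One way to test that theory, sketched here and untested: evict the locally built Spark artifacts so Maven has to resolve them the same way the build machine does.

    # remove the locally installed Spark artifacts from the cache, then
    # rebuild; Maven must now fetch Spark from the remote repos, which
    # should reproduce what the nightly machine sees
    rm -rf ~/.m2/repository/org/apache/spark
    mvn clean install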

@Andrew cleaned his Maven repo cache, unset SPARK_HOME, and is still getting the failure on master, again pointing to a problem in the Maven repos.

Can someone with a broken build try building Spark from source with “mvn clean install ...”, then try building Mahout? @Andrew? If that works, we have another clue.
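
Roughly this sequence (a sketch; the tag name and the mahout checkout path are assumptions, adjust to taste):

    # build Spark 1.1.0 from source against Hadoop 1.2.1 and install the
    # artifacts into the local Maven cache, then build Mahout against them
    git clone https://github.com/apache/spark.git
    cd spark
    git checkout v1.1.0
    mvn -Dhadoop.version=1.2.1 -DskipTests clean install
    cd ../mahout
    mvn clean install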

Not sure how to fix this. Is there any reliable way to build using only the external Maven repos for Spark, or should we switch the build machine to doing Spark builds too?
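
One way to force a build that uses only the external repos (a sketch; slow, since everything gets re-downloaded) is to point Maven at an empty throwaway local repository:

    # -Dmaven.repo.local overrides the local repository location; starting
    # from an empty directory forces every artifact, Spark included, to be
    # fetched from the remote repos, approximating a pristine build machine
    mvn -Dmaven.repo.local=/tmp/clean-m2 clean install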


On Oct 31, 2014, at 6:18 AM, Gokhan Capan <no...@github.com> wrote:

Alright, I am a little confused. 

@Pat, does this patch break the build, or does the failure also occur when 
packaging master? 

@Dmitriy, 
My rationale is that, as suggested at 
http://spark.apache.org/docs/latest/programming-guide.html#linking-with-spark, 
Mahout, as a Hadoop client, should declare hadoop-client and spark-core as 
dependencies and then just work. The hadoop-client version should be 
overridable to support multiple Hadoop versions, since people might have 
Hadoop clusters with different versions, from different vendors, etc. 
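
As a sketch of what I mean (assuming the hadoop-client version is keyed off a 
hadoop.version property in the poms): 

    # default build uses whatever hadoop.version the pom declares
    mvn clean package
    # the same source built for a different cluster; overriding the property
    # swaps the hadoop-client dependency and nothing else has to change
    mvn -Dhadoop.version=2.4.1 clean package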



Gokhan 

On Fri, Oct 31, 2014 at 5:20 AM, Dmitriy Lyubimov <no...@github.com> 
wrote: 

> oh. 
> 
> that's the hadoop versioning thing again, then. 
> 
> keep in mind -- the hadoop version in the maven dependencies is not the same 
> as the actual hadoop version spark has at runtime. 
> 
> in fact, the spark modules pull their hadoop version from spark's transitive 
> dependencies as compiled into your local maven cache. 
> 
> by default the nightly will pull the default spark artifact from central, 
> which will have whatever default hadoop version is there, and that is most 
> likely what the spark module is going to use for local tests. but generally 
> it shouldn't matter which hadoop version the spark tests run on, because 
> they are not writing/reading files created with any other version. 
> 
> 
> On Thu, Oct 30, 2014 at 6:23 PM, Pat Ferrel <no...@github.com> 
> wrote: 
> 
> > I'm getting the snappy build test error that broke the nightly build. 
> > Switching back to master and trying again. 