Posted to user@crunch.apache.org by "Whitacre,Micah" <MI...@CERNER.COM> on 2013/01/23 00:01:19 UTC

Build Inconsistencies with Profiles

For my code I'm maintaining a fork of Crunch simply because I target the CDH4.1.x endstates.  After consuming the latest changes, however, the tests are failing:

Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.113 sec <<< FAILURE!
testJoin(org.apache.crunch.lib.join.MultiAvroSchemaJoinIT)  Time elapsed: 1.113 sec  <<< ERROR!
org.apache.crunch.CrunchRuntimeException: java.io.IOException: No files found to materialize at: /var/folders/0f/l_2w0gxd0p15k9410b18j8q40000gp/T/junit6697703972065781019/tmp-crunch.tmp.dir/crunch-1096140411/p1
	at org.apache.crunch.materialize.MaterializableIterable.materialize(MaterializableIterable.java:70)
	at org.apache.crunch.impl.mr.MRPipeline.run(MRPipeline.java:158)
	at org.apache.crunch.materialize.MaterializableIterable.iterator(MaterializableIterable.java:59)
	at com.google.common.collect.Lists.newArrayList(Lists.java:119)
	at org.apache.crunch.lib.join.MultiAvroSchemaJoinIT.testJoin(MultiAvroSchemaJoinIT.java:116)

The attached changes are the only ones I've made to the project; they just add a profile to the root pom.xml.  To execute, I ran "mvn clean install -P cdh4".  Reviewing the git log, I don't see any major changes to materialize or join, which seem to be related to the issue.  I think it might be related to CRUNCH-127, as that was the only major change to how data is written.

While comparing my profile against the two existing ones, I noticed failures when building one of them.  Specifically, if I run "mvn clean install", the build runs and completes successfully.  However, if I specify the "hadoop-2" profile, it fails with a similar exception.

Executing the following at the root of the git repository:
mvn clean install -P hadoop-2

I get a failure during the integration testing of crunch:

Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 3.52 sec <<< FAILURE!
testMapsideJoin(org.apache.crunch.lib.join.MapsideJoinIT)  Time elapsed: 2.434 sec  <<< ERROR!
org.apache.crunch.CrunchRuntimeException: java.io.IOException: No files found to materialize at: /tmp/junit1288475102834326447/tmp-crunch.tmp.dir/crunch-2036458765/p3
	at org.apache.crunch.materialize.MaterializableIterable.materialize(MaterializableIterable.java:70)
	at org.apache.crunch.impl.mr.MRPipeline.run(MRPipeline.java:158)
	at org.apache.crunch.lib.join.MapsideJoinIT.runMapsideJoin(MapsideJoinIT.java:139)
	at org.apache.crunch.lib.join.MapsideJoinIT.testMapsideJoin(MapsideJoinIT.java:116)

Details about my current setup for building:

Git commit 75ba3546810fc27b4a6248559fefffd6c3eddb44

$ mvn -v
Apache Maven 3.0.4 (r1232337; 2012-01-17 02:44:56-0600)
Maven home: /usr/local/Cellar/maven/3.0.4/libexec
Java version: 1.6.0_37, vendor: Apple Inc.
Java home: /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home
Default locale: en_US, platform encoding: MacRoman
OS name: "mac os x", version: "10.7.5", arch: "x86_64", family: "mac"

I accept responsibility for maintaining the CDH4 support, but I'm curious: am I building the project correctly?  Is this a local setup issue rather than something others encounter as well?


CONFIDENTIALITY NOTICE This message and any included attachments are from Cerner Corporation and are intended only for the addressee. The information contained in this message is confidential and may constitute inside or non-public information under international, federal, or state securities laws. Unauthorized forwarding, printing, copying, distribution, or use of such information is strictly prohibited and may be unlawful. If you are not the addressee, please promptly delete this message and notify the sender of the delivery error by e-mail or you may call Cerner's corporate offices in Kansas City, Missouri, U.S.A at (+1) (816)221-1024.

Re: Build Inconsistencies with Profiles

Posted by Josh Wills <jw...@cloudera.com>.
On Wed, Jan 23, 2013 at 5:14 AM, Whitacre,Micah
<MI...@cerner.com> wrote:

>  Thanks.  I no longer get that failure when doing the build with
> "-Dcrunch.platform=2".  I'm still getting the failure on my CDH4 fork and
> will keep looking.  Your change gave me an idea of where to look.
>
>  I still do not get a completely successful build for "hadoop-2":
>
>  <testcase time="10.516"
> classname="org.apache.crunch.io.hbase.WordCountHBaseIT"
> name="testWordCount">
>     <error
> message="org.apache.hadoop.hdfs.MiniDFSCluster.&lt;init&gt;(ILorg/apache/hadoop/conf/Configuration;IZZZLorg/apache/hadoop/hdfs/server/common/HdfsConstants$StartupOption;[Ljava/lang/String;[Ljava/lang/String;[J)V"
> type="java.lang.NoSuchMethodError">java.lang.NoSuchMethodError:
> org.apache.hadoop.hdfs.MiniDFSCluster.&lt;init&gt;(ILorg/apache/hadoop/conf/Configuration;IZZZLorg/apache/hadoop/hdfs/server/common/HdfsConstants$StartupOption;[Ljava/lang/String;[Ljava/lang/String;[J)V
> at
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniDFSCluster(HBaseTestingUtility.java:430)
> at
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:598)
> at
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:554)
> at
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:523)
> at
> org.apache.crunch.io.hbase.WordCountHBaseIT.setUp(WordCountHBaseIT.java:160)
>
>  This actually seems to indicate a dependency mismatch between hbase and
> hadoop-minicluster.  There is a note in the POM about needing to build
> hbase 0.94.1 from src.  Is that still necessary, given that a released
> version is available in Maven Central[1]?
>

Yes, it is. Like Crunch, HBase builds its maven jars against Hadoop 1.x, so
running the Hadoop 2.x unit tests requires a custom build of hbase from
source. That's one of the things bigtop does when they put together one of
their package releases.
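
A minimal sketch for checking which jar actually supplies a class at test time, which can help confirm whether the HBase and Hadoop artifacts on the classpath belong together. The class name ClassOrigin and its locate helper are hypothetical and not part of Crunch, HBase, or Hadoop; only standard JDK calls are used.

```java
import java.net.URL;
import java.security.CodeSource;

/** Hypothetical helper: reports where a class was loaded from. */
final class ClassOrigin {

    /**
     * Returns the location (usually a jar or directory URL) that supplied
     * the named class, or a short note when no location is available.
     */
    static String locate(String className) {
        Class<?> cls;
        try {
            // initialize=false: only the origin is needed, not static initializers
            cls = Class.forName(className, false, ClassOrigin.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            return "(not on the classpath)";
        }
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        URL location = (source == null) ? null : source.getLocation();
        if (location == null) {
            // Classes loaded by the bootstrap loader report no code source
            return "(no code source: typically a JDK class)";
        }
        return location.toString();
    }

    public static void main(String[] args) {
        // Defaults to the class named in the NoSuchMethodError from this thread
        String[] names = args.length > 0
                ? args
                : new String[] {"org.apache.hadoop.hdfs.MiniDFSCluster"};
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Run it with the same classpath the failing test uses and pass the class names of interest as arguments; if the reported jar for MiniDFSCluster is not the Hadoop version the HBase jar was compiled against, that would be consistent with the mismatch discussed here.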


>
>  [1] -
> http://search.maven.org/#artifactdetails%7Corg.apache.hbase%7Chbase%7C0.94.1%7Cjar
>
>  On Jan 22, 2013, at 5:53 PM, Josh Wills wrote:
>
>  I think I have a fix, tracking it at:
> https://issues.apache.org/jira/browse/CRUNCH-148



-- 
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>

Re: Build Inconsistencies with Profiles

Posted by "Whitacre,Micah" <MI...@CERNER.COM>.
Thanks.  I no longer get that failure when doing the build with "-Dcrunch.platform=2".  I'm still getting the failure on my CDH4 fork and will keep looking.  Your change gave me an idea of where to look.

I still do not get a completely successful build for "hadoop-2":

<testcase time="10.516" classname="org.apache.crunch.io.hbase.WordCountHBaseIT" name="testWordCount">
    <error message="org.apache.hadoop.hdfs.MiniDFSCluster.&lt;init&gt;(ILorg/apache/hadoop/conf/Configuration;IZZZLorg/apache/hadoop/hdfs/server/common/HdfsConstants$StartupOption;[Ljava/lang/String;[Ljava/lang/String;[J)V" type="java.lang.NoSuchMethodError">java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.MiniDFSCluster.&lt;init&gt;(ILorg/apache/hadoop/conf/Configuration;IZZZLorg/apache/hadoop/hdfs/server/common/HdfsConstants$StartupOption;[Ljava/lang/String;[Ljava/lang/String;[J)V
at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniDFSCluster(HBaseTestingUtility.java:430)
at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:598)
at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:554)
at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:523)
at org.apache.crunch.io.hbase.WordCountHBaseIT.setUp(WordCountHBaseIT.java:160)

This actually seems to indicate a dependency mismatch between hbase and hadoop-minicluster.  There is a note in the POM about needing to build hbase 0.94.1 from src.  Is that still necessary, given that a released version is available in Maven Central[1]?

[1] - http://search.maven.org/#artifactdetails%7Corg.apache.hbase%7Chbase%7C0.94.1%7Cjar
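
The descriptor in the NoSuchMethodError above is easier to compare against the MiniDFSCluster source once it is decoded; <init> (shown XML-escaped in the report) denotes a constructor. The following is a minimal sketch of such a decoder, assuming a well-formed JVM method descriptor. DescriptorDecoder is a hypothetical helper, not part of any of the libraries involved.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: turns a JVM method descriptor into readable type names. */
final class DescriptorDecoder {

    /** Returns the parameter types in order, followed by the return type as the last element. */
    static List<String> decode(String descriptor) {
        if (descriptor.isEmpty() || descriptor.charAt(0) != '(' || descriptor.indexOf(')') < 0) {
            throw new IllegalArgumentException("Not a method descriptor: " + descriptor);
        }
        List<String> types = new ArrayList<>();
        int i = 1;
        while (descriptor.charAt(i) != ')') {
            i = readType(descriptor, i, types);   // one parameter per iteration
        }
        readType(descriptor, i + 1, types);       // return type follows ')'
        return types;
    }

    /** Reads one type starting at index i, appends its name, and returns the index just past it. */
    private static int readType(String d, int i, List<String> out) {
        int dims = 0;
        while (d.charAt(i) == '[') {              // each '[' adds one array dimension
            dims++;
            i++;
        }
        char c = d.charAt(i);
        String name;
        switch (c) {
            case 'B': name = "byte"; break;
            case 'C': name = "char"; break;
            case 'D': name = "double"; break;
            case 'F': name = "float"; break;
            case 'I': name = "int"; break;
            case 'J': name = "long"; break;
            case 'S': name = "short"; break;
            case 'Z': name = "boolean"; break;
            case 'V': name = "void"; break;
            case 'L': {
                // Object type: L<internal/name>; with '/' as the package separator
                int end = d.indexOf(';', i);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated object type at index " + i);
                }
                name = d.substring(i + 1, end).replace('/', '.');
                i = end;
                break;
            }
            default:
                throw new IllegalArgumentException("Unexpected '" + c + "' at index " + i);
        }
        StringBuilder sb = new StringBuilder(name);
        for (int k = 0; k < dims; k++) {
            sb.append("[]");
        }
        out.add(sb.toString());
        return i + 1;
    }

    public static void main(String[] args) {
        // Descriptor copied from the error reported in this thread
        String d = "(ILorg/apache/hadoop/conf/Configuration;IZZZ"
                + "Lorg/apache/hadoop/hdfs/server/common/HdfsConstants$StartupOption;"
                + "[Ljava/lang/String;[Ljava/lang/String;[J)V";
        for (String type : decode(d)) {
            System.out.println(type);
        }
    }
}
```

The decoded parameter list can then be checked against the constructors that the MiniDFSCluster class on the test classpath actually declares.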

On Jan 22, 2013, at 5:53 PM, Josh Wills wrote:

I think I have a fix, tracking it at: https://issues.apache.org/jira/browse/CRUNCH-148



Re: Build Inconsistencies with Profiles

Posted by Josh Wills <jw...@cloudera.com>.
I think I have a fix, tracking it at:
https://issues.apache.org/jira/browse/CRUNCH-148


On Tue, Jan 22, 2013 at 3:17 PM, Josh Wills <jw...@cloudera.com> wrote:

> Yeah, I'm getting it too. I'll dive in and take a look.
>
> FYI, I usually build with the "-Dcrunch.platform=2" option.



-- 
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>

Re: Build Inconsistencies with Profiles

Posted by Josh Wills <jw...@cloudera.com>.
Yeah, I'm getting it too. I'll dive in and take a look.

FYI, I usually build with the "-Dcrunch.platform=2" option.





-- 
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>