Posted to user@crunch.apache.org by Deepak Vohra <dv...@yahoo.com> on 2013/01/10 21:07:40 UTC

CRUNCH-140

The commands to test the bug:
https://issues.apache.org/jira/browse/CRUNCH-140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549900#comment-13549900

>cd /usr/local/apache-crunch-0.4.0-incubating-bin
>cp LICENSE LICENSE.txt
>sudo mkdir crunch

>hadoop jar crunch-examples-0.4.0-incubating-job.jar org.apache.crunch.examples.WordCount LICENSE.txt crunch

Re: CRUNCH-140

Posted by Deepak Vohra <dv...@yahoo.com>.
Yes. The input is required to be on the HDFS.

But the Crunch job still fails with the following output.
 
13/01/11 00:10:45 INFO collect.PGroupedTableImpl: Setting num reduce tasks to 1
13/01/11 00:10:48 INFO input.FileInputFormat: Total input paths to process : 1
13/01/11 00:10:48 INFO exec.CrunchJob: Running job "org.apache.crunch.examples.WordCount: Text(LICENSE.txt)+S0+Aggregate.count+GBK+combine+asText+Text(crunch)"
13/01/11 00:10:48 INFO exec.CrunchJob: Job status available at: http://ec2-50-19-55-40.compute-1.amazonaws.com:50030/jobdetails.jsp?jobid=job_201301102312_0002
1 job failure(s) occurred:
org.apache.crunch.examples.WordCount: Text(LICENSE.txt)+S0+Aggregate.count+GBK+combine+asText+Text(crunch)(class org.apache.crunch.examples.WordCount0): Job failed!


________________________________
 From: Josh Wills <jo...@gmail.com>
To: crunch-user@incubator.apache.org; Deepak Vohra <dv...@yahoo.com> 
Sent: Thursday, January 10, 2013 1:26:35 PM
Subject: Re: CRUNCH-140
 

Does running:

hadoop fs -put LICENSE.txt .

followed by hadoop jar ... fix it?
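
A minimal sketch of the sequence suggested above, written as a shell function. The function name run_wordcount and the HADOOP variable are assumptions made for illustration and are not part of Crunch or Hadoop; setting HADOOP=echo gives a dry run that only prints the commands.

```shell
# Sketch only: stage the input on HDFS, then run the Crunch WordCount example.
# HADOOP and run_wordcount are hypothetical names introduced for this example.
HADOOP="${HADOOP:-hadoop}"

run_wordcount() {
    input="$1"    # local file to upload, e.g. LICENSE.txt
    output="$2"   # output path, e.g. crunch
    # Copy the local file into the current user's HDFS home directory.
    $HADOOP fs -put "$input" . || return 1
    # Run the example; the job arguments are the input and output paths.
    $HADOOP jar crunch-examples-0.4.0-incubating-job.jar \
        org.apache.crunch.examples.WordCount "$input" "$output"
}
```

Calling run_wordcount LICENSE.txt crunch from the apache-crunch-0.4.0-incubating-bin directory would issue the two commands from this thread in order.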




Re: CRUNCH-140

Posted by Deepak Vohra <dv...@yahoo.com>.

The statement:

"The library is not compatible with versions of Hadoop prior to 1.0.x or 2.0.x,
such as version 0.20.x."

does mention the Hadoop version requirement, but should be modified to:

"The library is not compatible with versions of Hadoop prior to 1.0.1 or 2.0.x,
such as versions 1.0.0 and 0.20.x."



________________________________
 From: Josh Wills <jw...@cloudera.com>
To: Deepak Vohra <dv...@yahoo.com> 
Cc: "crunch-user@incubator.apache.org" <cr...@incubator.apache.org> 
Sent: Friday, January 11, 2013 10:18:43 AM
Subject: Re: CRUNCH-140
 

Okay-- in that kind of environment, it's best to turn speculative execution off.




Re: CRUNCH-140

Posted by Deepak Vohra <dv...@yahoo.com>.
Maybe a note should be added to the Apache Crunch website about the minimum Hadoop version requirement.


Hadoop 1.0.1 fixes the error in 

org.apache.crunch.examples.WordCount 
org.apache.crunch.examples.TotalBytesByIP
org.apache.crunch.examples.AverageBytesByIP

even without setting

<property>
<name>mapred.map.tasks.speculative.execution</name>
<value>false</value>
</property>

in mapred-site.xml.
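
The attempt IDs in the errors quoted in this thread contain _r_, which marks reduce attempts, so the reduce-side setting may matter as much as the map-side one mentioned above. A sketch, assuming Hadoop 1.x property names, that prints both properties for pasting inside the <configuration> element of mapred-site.xml; the function name print_speculative_off is hypothetical.

```shell
# Sketch only: print mapred-site.xml properties that disable speculative
# execution for both map and reduce tasks (Hadoop 1.x property names).
# print_speculative_off is a hypothetical helper name.
print_speculative_off() {
    for phase in map reduce; do
        cat <<EOF
<property>
  <name>mapred.${phase}.tasks.speculative.execution</name>
  <value>false</value>
</property>
EOF
    done
}
```

Whether either setting is needed depends on the cluster; per the message above, the examples ran on Hadoop 1.0.1 without it.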


________________________________
 From: Josh Wills <jw...@cloudera.com>
To: Deepak Vohra <dv...@yahoo.com> 
Cc: "crunch-user@incubator.apache.org" <cr...@incubator.apache.org> 
Sent: Friday, January 11, 2013 10:18:43 AM
Subject: Re: CRUNCH-140
 

Okay-- in that kind of environment, it's best to turn speculative execution off.




Re: CRUNCH-140

Posted by Josh Wills <jw...@cloudera.com>.
Okay-- in that kind of environment, it's best to turn speculative execution
off.


On Fri, Jan 11, 2013 at 10:17 AM, Deepak Vohra <dv...@yahoo.com> wrote:

>
> Yes. Single node  including a HBase cluster. Speculative execution is on
> by default, haven't set it to false.


-- 
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>

Re: CRUNCH-140

Posted by Deepak Vohra <dv...@yahoo.com>.

Yes. Single node, including an HBase cluster. Speculative execution is on by default; I haven't set it to false.


________________________________
 From: Josh Wills <jw...@cloudera.com>
To: crunch-user@incubator.apache.org; Deepak Vohra <dv...@yahoo.com> 
Sent: Friday, January 11, 2013 9:43:34 AM
Subject: Re: CRUNCH-140
 

Are you running on a single node w/speculative execution turned on?




Re: CRUNCH-140

Posted by Josh Wills <jw...@cloudera.com>.
Are you running on a single node w/speculative execution turned on?


On Fri, Jan 11, 2013 at 9:29 AM, Deepak Vohra <dv...@yahoo.com> wrote:

>
> Hadoop 1.0.0, which supports multiple output, also generates the same
> error:
>
> 2013-01-11 17:21:20,368 INFO org.apache.crunch.impl.mr.run.RTNode: Crunch
> exception in 'Text(crunch)' for input: [,61]
> org.apache.crunch.impl.mr.run.CrunchRuntimeException:
> org.apache.hadoop.ipc.RemoteException:
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to
> create file
> /tmp/crunch-140947123/p1/output/_temporary/_attempt_201301111630_0001_r_000000_3/part-r-00000
> for DFSClient_attempt_201301111630_0001_r_000000_3 on client 10.254.110.80
> because current leaseholder is trying to recreate file.
>


-- 
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>

Re: CRUNCH-140

Posted by Deepak Vohra <dv...@yahoo.com>.

Hadoop 1.0.0, which supports multiple outputs, also generates the same error:

2013-01-11 17:21:20,368 INFO org.apache.crunch.impl.mr.run.RTNode: Crunch exception in 'Text(crunch)' for input: [,61]
org.apache.crunch.impl.mr.run.CrunchRuntimeException: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /tmp/crunch-140947123/p1/output/_temporary/_attempt_201301111630_0001_r_000000_3/part-r-00000 for DFSClient_attempt_201301111630_0001_r_000000_3 on client 10.254.110.80 because current leaseholder is trying to recreate file.
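
The attempt ID in the path above follows Hadoop's attempt_<jobtracker id>_<job number>_<m|r>_<task number>_<attempt number> layout. A small sketch that splits it apart; describe_attempt is a hypothetical helper written for this note, not a Hadoop command.

```shell
# Sketch only: split a Hadoop task attempt ID into its fields.
# describe_attempt is a hypothetical helper name.
describe_attempt() {
    # Expected form: attempt_<jtid>_<job>_<m|r>_<task>_<attempt>
    old_ifs="$IFS"
    IFS=_
    set -- $1          # split the ID on underscores into $1..$6
    IFS="$old_ifs"
    case "$4" in
        m) kind=map ;;
        r) kind=reduce ;;
        *) kind=unknown ;;
    esac
    echo "job=job_${2}_${3} type=$kind task=$5 attempt=$6"
}
```

For the ID in the log above, describe_attempt attempt_201301111630_0001_r_000000_3 reports a reduce task with attempt number 3. Attempt numbers start at 0, so this is the fourth attempt of reduce task 0.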



________________________________
 From: Josh Wills <jo...@gmail.com>
To: Deepak Vohra <dv...@yahoo.com> 
Cc: "crunch-user@incubator.apache.org" <cr...@incubator.apache.org> 
Sent: Thursday, January 10, 2013 5:13:17 PM
Subject: Re: CRUNCH-140
 

Most likely, yes.



On Thu, Jan 10, 2013 at 5:12 PM, Deepak Vohra <dv...@yahoo.com> wrote:


>
>Would the HBase bug or error CRUNCH-141 be also because of the Hadoop version being pre-1.0.x?
>
>
>
>
>
>________________________________
> From: Josh Wills <jo...@gmail.com>
>To: Deepak Vohra <dv...@yahoo.com> 
>Cc: "crunch-user@incubator.apache.org" <cr...@incubator.apache.org> 
>Sent: Thursday, January 10, 2013 5:10:04 PM
>Subject: Re: CRUNCH-140
> 
>
>
>Hrm-- I think that's b/c you're using an earlier version of Hadoop (pre-1.0.x) that doesn't support multiple outputs, which Crunch relies on.
>
>
>
>On Thu, Jan 10, 2013 at 5:08 PM, Deepak Vohra <dv...@yahoo.com> wrote:
>
>From userlogs:
>>
>>
>>WARN
org.apache.hadoop.mapred.Child: Error running child 
>>org.apache.crunch.impl.mr.run.CrunchRuntimeException:
org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed
to create file
/tmp/crunch-531439899/p1/output/_temporary/_attempt_201301102312_0002_r_000000_3/part-r-00000
for DFSClient_attempt_201301102312_0002_r_000000_3 on client
10.210.42.32 because current leaseholder is trying to recreate file.
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>________________________________
>> From: Josh Wills <jo...@gmail.com>
>>To: Deepak Vohra <dv...@yahoo.com> 
>>Sent: Thursday, January 10, 2013 4:36:36 PM
>>Subject: Re: CRUNCH-140
>> 
>>
>>
>>K-- any info in the logs for the job?
>>
>>
>>
>>On Thu, Jan 10, 2013 at 4:35 PM, Deepak Vohra <dv...@yahoo.com> wrote:
>>
>>Yes. The input is required to be on the HDFS.
>>>
>>>
>>>But, the Crunch job still fails with the following output.
>>>13/01/11
00:10:45 INFO collect.PGroupedTableImpl: Setting num reduce tasks to
1 
>>>13/01/11
00:10:48 INFO input.FileInputFormat: Total input paths to process :
1 
>>>13/01/11
00:10:48 INFO exec.CrunchJob: Running job
"org.apache.crunch.examples.WordCount:
Text(LICENSE.txt)+S0+Aggregate.count+GBK+combine+asText+Text(crunch)" 
>>>13/01/11
00:10:48 INFO exec.CrunchJob: Job status available at: http://ec2-50-19-55-40.compute-1.amazonaws.com:50030/jobdetails.jsp?jobid=job_201301102312_0002 
>>>1
job failure(s) occurred: 
>>>org.apache.crunch.examples.WordCount:
Text(LICENSE.txt)+S0+Aggregate.count+GBK+combine+asText+Text(crunch)(class
org.apache.crunch.examples.WordCount0): Job failed! 
>>>
>>>
>>>
>>>________________________________
>>> From: Josh Wills <jo...@gmail.com>
>>>To: crunch-user@incubator.apache.org; Deepak Vohra <dv...@yahoo.com> 
>>>Sent: Thursday, January 10, 2013 1:26:35 PM
>>>Subject: Re: CRUNCH-140
>>> 
>>>
>>>
>>>Does running:
>>>
>>>
>>>hadoop fs -put LICENSE.txt .
>>>
>>>
>>>followed by hadoop jar ... fix it?
>>>
>>>
>>>
>>>On Thu, Jan 10, 2013 at 12:07 PM, Deepak Vohra <dv...@yahoo.com> wrote:
>>>
>>>The command/s to test bug
>>>>https://issues.apache.org/jira/browse/CRUNCH-140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549900#comment-13549900
>>>>
>>>>>cd /usr/local/apache-crunch-0.4.0-incubating-bin
>>>>>cp LICENSE LICENSE.txt
>>>>>sudo mkdir crunch
>>>>
>>>>>hadoop jar crunch-examples-0.4.0-incubating-job.jar org.apache.crunch.examples.WordCount
>>>>LICENSE.txt crunch  

Re: CRUNCH-140

Posted by Josh Wills <jo...@gmail.com>.
Most likely, yes.


On Thu, Jan 10, 2013 at 5:12 PM, Deepak Vohra <dv...@yahoo.com> wrote:

>
> Would the HBase bug or error CRUNCH-141 be also because of the Hadoop
> version being pre-1.0.x?
>
>
>   ------------------------------
> *From:* Josh Wills <jo...@gmail.com>
> *To:* Deepak Vohra <dv...@yahoo.com>
> *Cc:* "crunch-user@incubator.apache.org" <cr...@incubator.apache.org>
>
> *Sent:* Thursday, January 10, 2013 5:10:04 PM
> *Subject:* Re: CRUNCH-140
>
> Hrm-- I think that's b/c you're using an earlier version of Hadoop
> (pre-1.0.x) that doesn't support multiple outputs, which Crunch relies on.
>
>
> On Thu, Jan 10, 2013 at 5:08 PM, Deepak Vohra <dv...@yahoo.com> wrote:
>
> From userlogs:
>
> WARN org.apache.hadoop.mapred.Child: Error running child
> org.apache.crunch.impl.mr.run.CrunchRuntimeException:
> org.apache.hadoop.ipc.RemoteException:
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to
> create file
> /tmp/crunch-531439899/p1/output/_temporary/_attempt_201301102312_0002_r_000000_3/part-r-00000
> for DFSClient_attempt_201301102312_0002_r_000000_3 on client 10.210.42.32
> because current leaseholder is trying to recreate file.
>
>
>
>
>   ------------------------------
> *From:* Josh Wills <jo...@gmail.com>
> *To:* Deepak Vohra <dv...@yahoo.com>
> *Sent:* Thursday, January 10, 2013 4:36:36 PM
> *Subject:* Re: CRUNCH-140
>
> K-- any info in the logs for the job?
>
>
> On Thu, Jan 10, 2013 at 4:35 PM, Deepak Vohra <dv...@yahoo.com> wrote:
>
> Yes. The input is required to be on the HDFS.
>
> But, the Crunch job still fails with the following output.
> 13/01/11 00:10:45 INFO collect.PGroupedTableImpl: Setting num reduce tasks
> to 1
> 13/01/11 00:10:48 INFO input.FileInputFormat: Total input paths to process
> : 1
> 13/01/11 00:10:48 INFO exec.CrunchJob: Running job
> "org.apache.crunch.examples.WordCount:
> Text(LICENSE.txt)+S0+Aggregate.count+GBK+combine+asText+Text(crunch)"
> 13/01/11 00:10:48 INFO exec.CrunchJob: Job status available at:
> http://ec2-50-19-55-40.compute-1.amazonaws.com:50030/jobdetails.jsp?jobid=job_201301102312_0002
> 1 job failure(s) occurred:
> org.apache.crunch.examples.WordCount:
> Text(LICENSE.txt)+S0+Aggregate.count+GBK+combine+asText+Text(crunch)(class
> org.apache.crunch.examples.WordCount0): Job failed!
>
>    ------------------------------
> *From:* Josh Wills <jo...@gmail.com>
> *To:* crunch-user@incubator.apache.org; Deepak Vohra <dv...@yahoo.com>
> *Sent:* Thursday, January 10, 2013 1:26:35 PM
> *Subject:* Re: CRUNCH-140
>
> Does running:
>
> hadoop fs -put LICENSE.txt .
>
> followed by hadoop jar ... fix it?
>
>
> On Thu, Jan 10, 2013 at 12:07 PM, Deepak Vohra <dv...@yahoo.com> wrote:
>
> The command/s to test bug
>
> https://issues.apache.org/jira/browse/CRUNCH-140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549900#comment-13549900
>
> >cd /usr/local/apache-crunch-0.4.0-incubating-bin
> >cp LICENSE LICENSE.txt
> >sudo mkdir crunch
>
> >hadoop jar crunch-examples-0.4.0-incubating-job.jar
> org.apache.crunch.examples.WordCount
> LICENSE.txt crunch

Re: CRUNCH-140

Posted by Deepak Vohra <dv...@yahoo.com>.

Would the HBase error reported in CRUNCH-141 also be caused by the Hadoop version being pre-1.0.x?



________________________________
 From: Josh Wills <jo...@gmail.com>
To: Deepak Vohra <dv...@yahoo.com> 
Cc: "crunch-user@incubator.apache.org" <cr...@incubator.apache.org> 
Sent: Thursday, January 10, 2013 5:10:04 PM
Subject: Re: CRUNCH-140
 

Hrm-- I think that's b/c you're using an earlier version of Hadoop (pre-1.0.x) that doesn't support multiple outputs, which Crunch relies on.



On Thu, Jan 10, 2013 at 5:08 PM, Deepak Vohra <dv...@yahoo.com> wrote:

From userlogs:
>
>
>WARN org.apache.hadoop.mapred.Child: Error running child
>org.apache.crunch.impl.mr.run.CrunchRuntimeException: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /tmp/crunch-531439899/p1/output/_temporary/_attempt_201301102312_0002_r_000000_3/part-r-00000 for DFSClient_attempt_201301102312_0002_r_000000_3 on client 10.210.42.32 because current leaseholder is trying to recreate file.
>
>
>
>
>
>
>
>
>
>________________________________
> From: Josh Wills <jo...@gmail.com>
>To: Deepak Vohra <dv...@yahoo.com> 
>Sent: Thursday, January 10, 2013 4:36:36 PM
>Subject: Re: CRUNCH-140
> 
>
>
>K-- any info in the logs for the job?
>
>
>
>On Thu, Jan 10, 2013 at 4:35 PM, Deepak Vohra <dv...@yahoo.com> wrote:
>
>Yes. The input is required to be on the HDFS.
>>
>>
>>But, the Crunch job still fails with the following output.
>>13/01/11 00:10:45 INFO collect.PGroupedTableImpl: Setting num reduce tasks to 1
>>13/01/11 00:10:48 INFO input.FileInputFormat: Total input paths to process : 1
>>13/01/11 00:10:48 INFO exec.CrunchJob: Running job "org.apache.crunch.examples.WordCount: Text(LICENSE.txt)+S0+Aggregate.count+GBK+combine+asText+Text(crunch)"
>>13/01/11 00:10:48 INFO exec.CrunchJob: Job status available at: http://ec2-50-19-55-40.compute-1.amazonaws.com:50030/jobdetails.jsp?jobid=job_201301102312_0002
>>1 job failure(s) occurred:
>>org.apache.crunch.examples.WordCount: Text(LICENSE.txt)+S0+Aggregate.count+GBK+combine+asText+Text(crunch)(class org.apache.crunch.examples.WordCount0): Job failed!
>>
>>
>>
>>________________________________
>> From: Josh Wills <jo...@gmail.com>
>>To: crunch-user@incubator.apache.org; Deepak Vohra <dv...@yahoo.com> 
>>Sent: Thursday, January 10, 2013 1:26:35 PM
>>Subject: Re: CRUNCH-140
>> 
>>
>>
>>Does running:
>>
>>
>>hadoop fs -put LICENSE.txt .
>>
>>
>>followed by hadoop jar ... fix it?
>>
>>
>>
>>On Thu, Jan 10, 2013 at 12:07 PM, Deepak Vohra <dv...@yahoo.com> wrote:
>>
>>The command/s to test bug
>>>https://issues.apache.org/jira/browse/CRUNCH-140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549900#comment-13549900
>>>
>>>>cd /usr/local/apache-crunch-0.4.0-incubating-bin
>>>>cp LICENSE LICENSE.txt
>>>>sudo mkdir crunch
>>>
>>>>hadoop jar crunch-examples-0.4.0-incubating-job.jar org.apache.crunch.examples.WordCount
>>>LICENSE.txt crunch  

Re: CRUNCH-140

Posted by Josh Wills <jo...@gmail.com>.
Hrm-- I think that's b/c you're using an earlier version of Hadoop
(pre-1.0.x) that doesn't support multiple outputs, which Crunch relies on.
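
One rough way to see whether a cluster falls into the pre-1.0.x range mentioned above is to look at the first line of `hadoop version`, which normally reads `Hadoop <version>`. The sketch below is only an illustration: the helper name `is_pre_1_0` is made up here, and it compares the major version number alone, so it does not tell you which specific releases include the multiple-outputs support.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop or Crunch): succeeds when the
# given version string has a major number below 1, e.g. "0.20.2".
is_pre_1_0() {
  major=${1%%.*}          # keep everything before the first dot
  [ "$major" -lt 1 ]
}

# Only query the cluster when the hadoop CLI is actually on the PATH.
if command -v hadoop >/dev/null 2>&1; then
  # The first line of `hadoop version` looks like "Hadoop 0.20.2".
  ver=$(hadoop version 2>/dev/null | head -n 1 | awk '{print $2}')
  if [ -z "$ver" ]; then
    echo "Could not determine the Hadoop version"
  elif is_pre_1_0 "$ver"; then
    echo "Hadoop $ver is pre-1.0.x"
  else
    echo "Hadoop $ver is 1.0.x or later"
  fi
fi
```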


On Thu, Jan 10, 2013 at 5:08 PM, Deepak Vohra <dv...@yahoo.com> wrote:

> From userlogs:
>
> WARN org.apache.hadoop.mapred.Child: Error running child
> org.apache.crunch.impl.mr.run.CrunchRuntimeException:
> org.apache.hadoop.ipc.RemoteException:
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to
> create file
> /tmp/crunch-531439899/p1/output/_temporary/_attempt_201301102312_0002_r_000000_3/part-r-00000
> for DFSClient_attempt_201301102312_0002_r_000000_3 on client 10.210.42.32
> because current leaseholder is trying to recreate file.
>
>
>
>
>   ------------------------------
> *From:* Josh Wills <jo...@gmail.com>
> *To:* Deepak Vohra <dv...@yahoo.com>
> *Sent:* Thursday, January 10, 2013 4:36:36 PM
> *Subject:* Re: CRUNCH-140
>
> K-- any info in the logs for the job?
>
>
> On Thu, Jan 10, 2013 at 4:35 PM, Deepak Vohra <dv...@yahoo.com> wrote:
>
> Yes. The input is required to be on the HDFS.
>
> But, the Crunch job still fails with the following output.
> 13/01/11 00:10:45 INFO collect.PGroupedTableImpl: Setting num reduce tasks
> to 1
> 13/01/11 00:10:48 INFO input.FileInputFormat: Total input paths to process
> : 1
> 13/01/11 00:10:48 INFO exec.CrunchJob: Running job
> "org.apache.crunch.examples.WordCount:
> Text(LICENSE.txt)+S0+Aggregate.count+GBK+combine+asText+Text(crunch)"
> 13/01/11 00:10:48 INFO exec.CrunchJob: Job status available at:
> http://ec2-50-19-55-40.compute-1.amazonaws.com:50030/jobdetails.jsp?jobid=job_201301102312_0002
> 1 job failure(s) occurred:
> org.apache.crunch.examples.WordCount:
> Text(LICENSE.txt)+S0+Aggregate.count+GBK+combine+asText+Text(crunch)(class
> org.apache.crunch.examples.WordCount0): Job failed!
>
>    ------------------------------
> *From:* Josh Wills <jo...@gmail.com>
> *To:* crunch-user@incubator.apache.org; Deepak Vohra <dv...@yahoo.com>
> *Sent:* Thursday, January 10, 2013 1:26:35 PM
> *Subject:* Re: CRUNCH-140
>
> Does running:
>
> hadoop fs -put LICENSE.txt .
>
> followed by hadoop jar ... fix it?
>
>
> On Thu, Jan 10, 2013 at 12:07 PM, Deepak Vohra <dv...@yahoo.com> wrote:
>
> The command/s to test bug
>
> https://issues.apache.org/jira/browse/CRUNCH-140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549900#comment-13549900
>
> >cd /usr/local/apache-crunch-0.4.0-incubating-bin
> >cp LICENSE LICENSE.txt
> >sudo mkdir crunch
>
> >hadoop jar crunch-examples-0.4.0-incubating-job.jar
> org.apache.crunch.examples.WordCount
> LICENSE.txt crunch

Re: CRUNCH-140

Posted by Deepak Vohra <dv...@yahoo.com>.
From userlogs:

WARN org.apache.hadoop.mapred.Child: Error running child
org.apache.crunch.impl.mr.run.CrunchRuntimeException: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /tmp/crunch-531439899/p1/output/_temporary/_attempt_201301102312_0002_r_000000_3/part-r-00000 for DFSClient_attempt_201301102312_0002_r_000000_3 on client 10.210.42.32 because current leaseholder is trying to recreate file.





________________________________
 From: Josh Wills <jo...@gmail.com>
To: Deepak Vohra <dv...@yahoo.com> 
Sent: Thursday, January 10, 2013 4:36:36 PM
Subject: Re: CRUNCH-140
 

K-- any info in the logs for the job?



On Thu, Jan 10, 2013 at 4:35 PM, Deepak Vohra <dv...@yahoo.com> wrote:

Yes. The input is required to be on the HDFS.
>
>
>But, the Crunch job still fails with the following output.
>13/01/11 00:10:45 INFO collect.PGroupedTableImpl: Setting num reduce tasks to 1
>13/01/11 00:10:48 INFO input.FileInputFormat: Total input paths to process : 1
>13/01/11 00:10:48 INFO exec.CrunchJob: Running job "org.apache.crunch.examples.WordCount: Text(LICENSE.txt)+S0+Aggregate.count+GBK+combine+asText+Text(crunch)"
>13/01/11 00:10:48 INFO exec.CrunchJob: Job status available at: http://ec2-50-19-55-40.compute-1.amazonaws.com:50030/jobdetails.jsp?jobid=job_201301102312_0002
>1 job failure(s) occurred:
>org.apache.crunch.examples.WordCount: Text(LICENSE.txt)+S0+Aggregate.count+GBK+combine+asText+Text(crunch)(class org.apache.crunch.examples.WordCount0): Job failed!
>
>
>
>________________________________
> From: Josh Wills <jo...@gmail.com>
>To: crunch-user@incubator.apache.org; Deepak Vohra <dv...@yahoo.com> 
>Sent: Thursday, January 10, 2013 1:26:35 PM
>Subject: Re: CRUNCH-140
> 
>
>
>Does running:
>
>
>hadoop fs -put LICENSE.txt .
>
>
>followed by hadoop jar ... fix it?
>
>
>
>On Thu, Jan 10, 2013 at 12:07 PM, Deepak Vohra <dv...@yahoo.com> wrote:
>
>The command/s to test bug
>>https://issues.apache.org/jira/browse/CRUNCH-140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549900#comment-13549900
>>
>>>cd /usr/local/apache-crunch-0.4.0-incubating-bin
>>>cp LICENSE LICENSE.txt
>>>sudo mkdir crunch
>>
>>>hadoop jar crunch-examples-0.4.0-incubating-job.jar org.apache.crunch.examples.WordCount
>>LICENSE.txt crunch  

Re: CRUNCH-140

Posted by Josh Wills <jo...@gmail.com>.
Does running:

hadoop fs -put LICENSE.txt .

followed by hadoop jar ... fix it?
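
Spelled out as one sequence, the two steps could look like the sketch below. It reuses only the commands already quoted in this thread; the wrapper name `stage_and_run` is invented for illustration, and it assumes the commands are run from the Crunch install directory with `hadoop` on the PATH.

```shell
#!/bin/sh
# Hypothetical wrapper around the commands quoted in this thread.
stage_and_run() {
  input=$1    # local file to copy into HDFS, e.g. LICENSE.txt
  output=$2   # output path for the job, e.g. crunch

  # Copy the local file into the current user's HDFS home directory.
  hadoop fs -put "$input" . || return 1

  # Run the bundled WordCount example against the HDFS copy.
  hadoop jar crunch-examples-0.4.0-incubating-job.jar \
    org.apache.crunch.examples.WordCount "$input" "$output"
}

# Example invocation (commented out so sourcing this file has no side effects):
# stage_and_run LICENSE.txt crunch
```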


On Thu, Jan 10, 2013 at 12:07 PM, Deepak Vohra <dv...@yahoo.com> wrote:

> The command/s to test bug
>
> https://issues.apache.org/jira/browse/CRUNCH-140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549900#comment-13549900
>
> >cd /usr/local/apache-crunch-0.4.0-incubating-bin
> >cp LICENSE LICENSE.txt
> >sudo mkdir crunch
>
> >hadoop jar crunch-examples-0.4.0-incubating-job.jar
> org.apache.crunch.examples.WordCount
> LICENSE.txt crunch
>