Posted to user@hadoop.apache.org by Oleg Zhurakousky <ol...@gmail.com> on 2012/12/07 19:06:57 UTC

Input path with no Output path

Guys

I have a simple mapper that reads records and sends out a message whenever it encounters one it is interested in (no reducer). So no output is ever written, but it seems a job cannot be submitted unless an output path is specified. Not a big deal to specify a dummy one, but I was wondering if it could be avoided.

Thanks
Oleg

Re: "attempt*" directories in user logs

Posted by Hemanth Yamijala <yh...@thoughtworks.com>.
However, in the case Oleg is talking about, the attempts are:
attempt_201212051224_0021_m_000000_0
attempt_201212051224_0021_m_000002_0
attempt_201212051224_0021_m_000003_0

These aren't multiple attempts of a single task, are they? They are
actually different tasks. If they were multiple attempts, I would expect
the last digit to get incremented, like attempt_201212051224_0021_m_000000_0
and attempt_201212051224_0021_m_000000_1, for instance.
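
To make the naming scheme concrete, here is a minimal sketch that splits an attempt directory name into its task part and its attempt number. The class and method names (AttemptIdParser, taskOf, attemptNumberOf) are made up for illustration and are not part of Hadoop; the pattern simply follows the layout attempt_<JobTracker start time>_<job number>_<m|r>_<task number>_<attempt number> seen in the listing.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Hadoop class): splits an attempt directory name
// into the task it belongs to and the attempt number within that task.
public class AttemptIdParser {
    // attempt_<JobTracker start time>_<job number>_<m|r>_<task number>_<attempt number>
    private static final Pattern ATTEMPT =
            Pattern.compile("attempt_(\\d+)_(\\d+)_([mr])_(\\d+)_(\\d+)");

    // Returns the task part, e.g. "m_000002", or null if the name is not an attempt ID.
    static String taskOf(String name) {
        Matcher m = ATTEMPT.matcher(name);
        return m.matches() ? m.group(3) + "_" + m.group(4) : null;
    }

    // Returns the trailing attempt number, or -1 if the name is not an attempt ID.
    static int attemptNumberOf(String name) {
        Matcher m = ATTEMPT.matcher(name);
        return m.matches() ? Integer.parseInt(m.group(5)) : -1;
    }

    public static void main(String[] args) {
        String[] names = {
            "attempt_201212051224_0021_m_000000_0",
            "attempt_201212051224_0021_m_000002_0",
            "attempt_201212051224_0021_m_000003_0",
            "attempt_201212051224_0021_r_000000_0",
            "job-acls.xml"
        };
        for (String name : names) {
            String task = taskOf(name);
            if (task == null) {
                continue; // skip entries that are not attempt directories
            }
            System.out.println(task + " -> attempt " + attemptNumberOf(name));
        }
    }
}
```

Run against the listing from the slave node, every directory maps to a different task (m_000000, m_000002, m_000003, r_000000) and every attempt number is 0, which is why these look like four separate tasks rather than retries of one.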

It looks like at least 3 different tasks were launched on this node. One of
them could be the setup task. Oleg, how many map tasks does the JobTracker UI
show for this job?

Thanks
hemanth


On Tue, Dec 11, 2012 at 12:19 AM, Vinod Kumar Vavilapalli <
vinodkv@hortonworks.com> wrote:

>
> MR launches multiple attempts for single Task in case of TaskAttempt
> failures or when speculative execution is turned on. In either case, a
> given Task will only ever have one successful TaskAttempt whose output will
> be accepted (committed).
>
> Number of reduces is set to 1 by default in mapred-default.xml - you
> should explicitly set it to zero if you don't want reducers.
>
> By master, I suppose you mean JobTracker. JobTracker doesn't show all the
> attempts for a given Task, you should navigate to per-task page to see that.
>
>
> Thanks,
> +Vinod Kumar Vavilapalli
> Hortonworks Inc.
> http://hortonworks.com/
>
> On Dec 9, 2012, at 6:53 AM, Oleg Zhurakousky wrote:
>
> I studying user logs on the two node cluster that I have setup and I was
> wondering if anyone can shed some light on these "attempt*' directories
>
> $ ls
>
> attempt_201212051224_0021_m_000000_0  attempt_201212051224_0021_m_000003_0
>  job-acls.xml
> attempt_201212051224_0021_m_000002_0  attempt_201212051224_0021_r_000000_0
>
> I mean its obvious that its talking about 3 attempts for Map task and 1
> attempt for reduce task. However my current MR job only results in some
> output written to "attempt_201212051224_0021_m_000000_0". Nothing is the
> reduce part (understandably since I don't even have a reducer, so my
> question is:
>
> 1. The two more M attempts. . . what are they?
> 2. Why was there an attempt to do a Reduce when no reducer was
> provided.implemented
> 3. Why my master node only had 1 attempt for M task but the slave had all
> that's displayed and questioned above (the 'ls' output above is from the
> slave node)
>
> Thanks
> Oleg
>
>
>

Re: "attempt*" directories in user logs

Posted by Vinod Kumar Vavilapalli <vi...@hortonworks.com>.
MR launches multiple attempts for a single Task when a TaskAttempt fails or when speculative execution is turned on. In either case, a given Task will only ever have one successful TaskAttempt whose output will be accepted (committed).

The number of reduces is set to 1 by default in mapred-default.xml; you should explicitly set it to zero if you don't want reducers.
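
As a rough illustration, the override could look like the fragment below. It assumes the classic (JobTracker-era) property name mapred.reduce.tasks; newer releases call the same setting mapreduce.job.reduces. Putting it in a site-wide file affects every job, so a per-job override is usually the safer choice, for example calling setNumReduceTasks(0) on the job in the driver.

```xml
<!-- Sketch only: overrides the default of 1 from mapred-default.xml. -->
<!-- Assumes the classic property name; set it per job if other jobs need reducers. -->
<property>
  <name>mapred.reduce.tasks</name>
  <value>0</value>
</property>
```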

By master, I suppose you mean the JobTracker. The JobTracker doesn't show all the attempts for a given Task; you should navigate to the per-task page to see that.


Thanks,
+Vinod Kumar Vavilapalli
Hortonworks Inc.
http://hortonworks.com/

On Dec 9, 2012, at 6:53 AM, Oleg Zhurakousky wrote:

> I studying user logs on the two node cluster that I have setup and I was wondering if anyone can shed some light on these "attempt*' directories
>> $ ls
> attempt_201212051224_0021_m_000000_0  attempt_201212051224_0021_m_000003_0  job-acls.xml
> attempt_201212051224_0021_m_000002_0  attempt_201212051224_0021_r_000000_0
> 
> I mean its obvious that its talking about 3 attempts for Map task and 1 attempt for reduce task. However my current MR job only results in some output written to "attempt_201212051224_0021_m_000000_0". Nothing is the reduce part (understandably since I don't even have a reducer, so my question is:
> 
> 1. The two more M attempts. . . what are they?
> 2. Why was there an attempt to do a Reduce when no reducer was provided.implemented
> 3. Why my master node only had 1 attempt for M task but the slave had all that's displayed and questioned above (the 'ls' output above is from the slave node)
> 
> Thanks
> Oleg


Re: "attempt*" directories in user logs

Posted by Tsuyoshi OZAWA <oz...@gmail.com>.
Hi Oleg,

Speculative tasks can be launched as additional TaskAttempts in MR jobs.
Also, if no reducer class is set, MR launches the default Reducer
class (IdentityReducer).

Thanks,
Tsuyoshi


On Sun, Dec 9, 2012 at 11:53 PM, Oleg Zhurakousky <
oleg.zhurakousky@gmail.com> wrote:

> I studying user logs on the two node cluster that I have setup and I was
> wondering if anyone can shed some light on these "attempt*' directories
> >$ ls
> attempt_201212051224_0021_m_000000_0  attempt_201212051224_0021_m_000003_0
>  job-acls.xml
> attempt_201212051224_0021_m_000002_0  attempt_201212051224_0021_r_000000_0
>
> I mean its obvious that its talking about 3 attempts for Map task and 1
> attempt for reduce task. However my current MR job only results in some
> output written to "attempt_201212051224_0021_m_000000_0". Nothing is the
> reduce part (understandably since I don't even have a reducer, so my
> question is:
>
> 1. The two more M attempts. . . what are they?
> 2. Why was there an attempt to do a Reduce when no reducer was
> provided.implemented
> 3. Why my master node only had 1 attempt for M task but the slave had all
> that's displayed and questioned above (the 'ls' output above is from the
> slave node)
>
> Thanks
> Oleg
>




-- 
OZAWA Tsuyoshi

"attempt*" directories in user logs

Posted by Oleg Zhurakousky <ol...@gmail.com>.
I am studying user logs on the two-node cluster that I have set up, and I was wondering if anyone can shed some light on these "attempt*" directories:
$ ls
attempt_201212051224_0021_m_000000_0  attempt_201212051224_0021_m_000003_0  job-acls.xml
attempt_201212051224_0021_m_000002_0  attempt_201212051224_0021_r_000000_0

I mean, it's obvious that it's talking about 3 attempts for the Map task and 1 attempt for the reduce task. However, my current MR job only results in some output written to "attempt_201212051224_0021_m_000000_0". Nothing is in the reduce part (understandably, since I don't even have a reducer), so my questions are:

1. The two additional M attempts: what are they?
2. Why was there an attempt to do a Reduce when no reducer was provided/implemented?
3. Why did my master node have only 1 attempt for the M task while the slave had everything displayed and questioned above? (The 'ls' output above is from the slave node.)

Thanks
Oleg
 

Re: Input path with no Output path

Posted by Oleg Zhurakousky <ol...@gmail.com>.
Perfect! Thanks

On Dec 7, 2012, at 1:21 PM, Peyman Mohajerian <mo...@gmail.com> wrote:

> I think this does it:
> http://hadoop.apache.org/docs/r0.20.1/api/org/apache/hadoop/mapreduce/lib/output/NullOutputFormat.html
> 
> On Fri, Dec 7, 2012 at 10:06 AM, Oleg Zhurakousky <ol...@gmail.com> wrote:
> Guys
> 
> I have a simple mapper that reads a records and sends out a message as it encounters the ones it is interested in (no reducer). So no output is ever written, but it seems like a job can not be submitted unless Output Path is specified. Not a big deal to specify a dummy one, but was wondering if it could be avoided.
> 
> Thanks
> Oleg
> 


Re: Input path with no Output path

Posted by Peyman Mohajerian <mo...@gmail.com>.
I think this does it:
http://hadoop.apache.org/docs/r0.20.1/api/org/apache/hadoop/mapreduce/lib/output/NullOutputFormat.html

On Fri, Dec 7, 2012 at 10:06 AM, Oleg Zhurakousky <
oleg.zhurakousky@gmail.com> wrote:

> Guys
>
> I have a simple mapper that reads a records and sends out a message as it
> encounters the ones it is interested in (no reducer). So no output is ever
> written, but it seems like a job can not be submitted unless Output Path is
> specified. Not a big deal to specify a dummy one, but was wondering if it
> could be avoided.
>
> Thanks
> Oleg
