Posted to mapreduce-user@hadoop.apache.org by Bejoy KS <be...@gmail.com> on 2011/09/07 09:48:53 UTC

No Mapper but Reducer

Hi,
I have a question. Is it possible to run a job with no mappers, only
reducers? AFAIK, to avoid triggering reducers we can set numReduceTasks to
zero, but there is no equivalent setting for mappers. So how can this be
achieved, if it is possible at all?

Thank You

Regards
Bejoy.K.S

Re: No Mapper but Reducer

Posted by Harsh J <ha...@cloudera.com>.

Nope. A reducer's input is from the map outputs alone (fetched in by
the shuffling code), which would not exist here.

What are you looking to do? Why won't a map task suffice for doing that?
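
To illustrate the point about reducer input, here is a minimal Python sketch (a toy model, not Hadoop code) of how map outputs are partitioned and grouped before any reducer sees them. The names `partition_for` and `shuffle`, and the use of CRC32 as the hash, are assumptions made for illustration only.

```python
import zlib
from collections import defaultdict


def partition_for(key, num_reducers):
    """Pick a reducer for a key (CRC32 is a stand-in for the key's hash code)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(map_outputs, num_reducers):
    """Group (key, value) pairs from the map side into per-reducer, key-sorted input."""
    reducers = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in map_outputs:
        reducers[partition_for(key, num_reducers)][key].append(value)
    # Each reducer sees its keys in sorted order, with all values for a key together.
    return [sorted(groups.items()) for groups in reducers]


# With no map outputs there is nothing to fetch: every reducer gets an empty input.
print(shuffle([], 2))  # [[], []]
```

In this model the only way data reaches a reducer is through `map_outputs`, which mirrors the statement that the reduce side is fed by the shuffle alone.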

On Wed, Sep 7, 2011 at 4:51 PM, Bejoy KS <be...@gmail.com> wrote:
> Thank you all. I had also noticed this strange behavior some time back.
> My initial concern still remains, though. If my input directory is empty,
> the map tasks won't be executed, but my reducer needs input to do its
> processing/aggregation. In such a scenario, is there an option to provide
> input just to the reducer?
>
> Regards
> Bejoy.K.S
>
> On Wed, Sep 7, 2011 at 3:09 PM, Sudharsan Sampath <su...@gmail.com>
> wrote:
>>
>> This is true, and it took us by surprise recently. It also had quite an
>> impact on our job cycles, where the size of the input is unpredictable and
>> can be zero at times.
>> In one of our cycles we run a lot of jobs. Say we configure X as the number
>> of reducers for a job which does not have any input, and let:
>> Y -> number of tasktrackers in the cluster
>> H -> time interval for the heartbeat response
>> With the cdh2 version, the job takes
>> (X / Y) * H seconds to complete without doing any work, since only one
>> reduce task is assigned per heartbeat.
>>
>> If a cycle contains many such jobs, the total time that the cluster spends
>> doing nothing accumulates.
>> I was thinking of raising this as a JIRA but am not sure. Should we raise and
>> fix this as a JIRA request? Could the number of reducers set by the client be
>> overridden if the number of mappers is 0?
>> We have a workaround, verifying the existence of the input path to the map
>> phase ourselves, but it would be more intuitive for the framework to handle
>> this itself.
>> -Sudhan S
>> On Wed, Sep 7, 2011 at 2:25 PM, Harsh J <ha...@cloudera.com> wrote:
>>>
>>> Oh boy are you in for a surprise. Reducers _can_ run with 0 mappers in a
>>> job ;-)
>>>
>>> /me puts his troll-mask on.
>>>
>>> ➜  ~HADOOP_HOME  hadoop fs -mkdir abc
>>> ➜  ~HADOOP_HOME  hadoop jar hadoop-examples-0.20.2-cdh3u1.jar wordcount abc out
>>> 11/09/07 14:24:14 INFO input.FileInputFormat: Total input paths to process : 0
>>> 11/09/07 14:24:14 INFO mapred.JobClient: Running job: job_201109071413_0001
>>> 11/09/07 14:24:15 INFO mapred.JobClient:  map 0% reduce 0%
>>> 11/09/07 14:24:21 INFO mapred.JobClient:  map 0% reduce 100%
>>> 11/09/07 14:24:22 INFO mapred.JobClient: Job complete: job_201109071413_0001
>>> 11/09/07 14:24:22 INFO mapred.JobClient: Counters: 13
>>> 11/09/07 14:24:22 INFO mapred.JobClient:   Job Counters
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Launched reduce tasks=1
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=2209
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=3113
>>> 11/09/07 14:24:22 INFO mapred.JobClient:   FileSystemCounters
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=59220
>>> 11/09/07 14:24:22 INFO mapred.JobClient:   Map-Reduce Framework
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Reduce input groups=0
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Combine output records=0
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Reduce shuffle bytes=0
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Reduce output records=0
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Spilled Records=0
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Combine input records=0
>>> 11/09/07 14:24:22 INFO mapred.JobClient:     Reduce input records=0
>>>
>>> /me takes off troll mask.
>>>
>>> On Wed, Sep 7, 2011 at 1:30 PM, Bejoy KS <be...@gmail.com> wrote:
>>> > Thanks Sonal. I was just thinking of some weird design and wanted to
>>> > make sure whether there is a possibility like that: no maps and all
>>> > reducers.
>>> >
>>> > On Wed, Sep 7, 2011 at 1:22 PM, Sonal Goyal <so...@gmail.com> wrote:
>>> >>
>>> >> I don't think that is possible. Can you explain in what scenario you
>>> >> want to have no mappers, only reducers?
>>> >> Best Regards,
>>> >> Sonal
>>> >> Crux: Reporting for HBase
>>> >> Nube Technologies
>>> --
>>> Harsh J
>>
>
>



-- 
Harsh J
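
Sudharsan's estimate quoted above can be worked through with concrete numbers. The Python sketch below is illustrative only: the function names, the sample values, and the use of a local directory in place of an HDFS path are all assumptions. The formula is simply the (X / Y) * H estimate from the message, and the second function mirrors the workaround of checking the input path before submitting a job.

```python
from pathlib import Path


def idle_seconds(num_reducers, num_tasktrackers, heartbeat_seconds):
    """Estimate from the thread: (X / Y) * H seconds spent scheduling empty reduces."""
    return (num_reducers / num_tasktrackers) * heartbeat_seconds


def has_input(input_dir):
    """Return True if the directory holds at least one non-empty file.

    A local-filesystem stand-in for checking the job's input path before
    submitting; a real job would ask HDFS instead.
    """
    path = Path(input_dir)
    if not path.is_dir():
        return False
    return any(f.is_file() and f.stat().st_size > 0 for f in path.iterdir())


# Hypothetical numbers: 100 reducers, 10 tasktrackers, 3-second heartbeat.
print(idle_seconds(100, 10, 3))  # 30.0
```

A driver could call `has_input` first and skip submission when it returns False, which avoids paying the idle time at all.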

RE: No Mapper but Reducer

Posted by "GOEKE, MATTHEW (AG/1000)" <ma...@monsanto.com>.

Your last question is not as straightforward and would be better answered by running the job on your own cluster and looking at the JobTracker history. Data skew and partitioning, the map and reduce slots available, mapred.reduce.slowstart.completed.maps, and several other factors can affect this distribution.

Matt
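
As one concrete example of the factors listed above, mapred.reduce.slowstart.completed.maps is the fraction of map tasks that must finish before reduce tasks are scheduled. The Python sketch below models only that rule. The function name and sample numbers are assumptions for illustration, the 0.05 value is the usual default (check your own cluster's configuration), and real scheduling also depends on free slots.

```python
def reducers_may_start(completed_maps, total_maps, slowstart=0.05):
    """True once the completed-map fraction reaches the slow-start threshold."""
    if total_maps == 0:
        # No maps to wait for, so nothing holds the reducers back.
        return True
    return completed_maps / total_maps >= slowstart


print(reducers_may_start(4, 100))  # False: 4% is below the 5% threshold
print(reducers_may_start(5, 100))  # True: 5% meets the threshold
print(reducers_may_start(0, 0))    # True: the empty-input case from this thread
```

A higher threshold delays the shuffle and so shifts more of the wall-clock time into the map phase, which is one reason the percentage split has to be measured per job.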

From: Bejoy KS [mailto:bejoy.hadoop@gmail.com]
Sent: Thursday, September 08, 2011 1:10 AM
To: mapreduce-user@hadoop.apache.org
Subject: Re: No Mapper but Reducer

Exactly, Matthew. The weird thought was in that direction. Basically, I have a tilde-separated input which has to undergo an aggregation operation, so I was trying to see whether it is possible to go straight into the sort/shuffle phase and then the reducer, without a mapper. I know I need to depend on at least an IdentityMapper.
A small query on top of this: if we take a basic MapReduce job, say word count without a combiner, what would be the percentage distribution of execution time across the map, sort/shuffle, and reduce phases?

On Wed, Sep 7, 2011 at 10:30 PM, GOEKE, MATTHEW (AG/1000) <ma...@monsanto.com> wrote:
Bejoy,

What exactly is your use case? I know down below you said you were just thinking of a weird design but it would really help if we knew exactly what you were shooting for because we might be able to refactor it.

I developed a job that still required the input to be sorted for the reduce, but I did not need any transformation or filtering on the map side, so I just used an identity mapper, as Robert mentions below, and it works perfectly. I do not think there is any way to pass data directly into the S/S phase without going through the map phase (if that is what you were hinting at), and if you don't require the data to go through S/S then you can make it a map-only job.

Matt

This e-mail message may contain privileged and/or confidential information, and is intended to be received only by persons entitled
to receive such information. If you have received this e-mail in error, please notify the sender immediately. Please delete it and
all attachments from any servers, hard drives or any other media. Other use of this e-mail by you is strictly prohibited.

All e-mails and attachments sent and received are subject to monitoring, reading and archival by Monsanto, including its
subsidiaries. The recipient of this e-mail is solely responsible for checking for the presence of "Viruses" or other "Malware".
Monsanto, along with its subsidiaries, accepts no liability for any damage caused by any such code transmitted by or accompanying
this e-mail or any attachment.

The information contained in this email may be subject to the export control laws and regulations of the United States, potentially
including but not limited to the Export Administration Regulations (EAR) and sanctions regulations issued by the U.S. Department of
Treasury, Office of Foreign Asset Controls (OFAC).  As a recipient of this information you are obligated to comply with all
applicable U.S. export laws and regulations.

Re: No Mapper but Reducer

Posted by Bejoy KS <be...@gmail.com>.

Exactly, Matthew. The weird thought was in that direction. Basically, I have
a tilde-separated input which has to undergo an aggregation operation, so I
was trying to see whether it is possible to go straight into the sort/shuffle
phase and then the reducer, without a mapper. I know I need to depend on at
least an IdentityMapper.
A small query on top of this: if we take a basic MapReduce job, say word
count without a combiner, what would be the percentage distribution of
execution time across the map, sort/shuffle, and reduce phases?
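
For the tilde-separated input described above, a minimal sketch of the map and reduce logic might look like the following, written in Python in the style of Hadoop Streaming scripts. The record layout (key in the first field, a numeric amount in the second) and all function names are assumptions, and the sort between the two steps stands in for the sort/shuffle phase.

```python
from itertools import groupby


def map_record(line):
    """Turn one 'key~amount~...' record into a (key, amount) pair."""
    fields = line.rstrip("\n").split("~")
    return fields[0], float(fields[1])


def reduce_group(key, amounts):
    """Aggregate all amounts that share a key (here: a simple sum)."""
    return key, sum(amounts)


def run(lines):
    """Map every record, sort by key, then reduce each group of equal keys."""
    pairs = sorted(map_record(line) for line in lines)  # stand-in for sort/shuffle
    return [
        reduce_group(key, [amount for _, amount in group])
        for key, group in groupby(pairs, key=lambda pair: pair[0])
    ]


print(run(["a~1.5~x", "b~2~y", "a~2.5~z"]))  # [('a', 4.0), ('b', 2.0)]
```

The map step does nothing beyond splitting the record, but something still has to emit key/value pairs for the shuffle to sort, which is why it cannot be dropped.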



RE: No Mapper but Reducer

Posted by "GOEKE, MATTHEW (AG/1000)" <ma...@monsanto.com>.

Bejoy,

What exactly is your use case? I know down below you said you were just thinking of a weird design but it would really help if we knew exactly what you were shooting for because we might be able to refactor it.

I developed a job that still required the input to be sorted for the reduce, but I did not need any transformation or filtering on the map side, so I just used an identity mapper, as Robert mentions below, and it works perfectly. I do not think there is any way to pass data directly into the S/S phase without going through the map phase (if that is what you were hinting at), and if you don't require the data to go through S/S then you can make it a map-only job.

Matt

From: Robert Hafner [mailto:tedivm@tedivm.com]
Sent: Wednesday, September 07, 2011 11:34 AM
To: mapreduce-user@hadoop.apache.org
Subject: Re: No Mapper but Reducer

You could just have a mapper which sends off the exact values it takes in (i.e., outputs k1,v1 as k2,v2). I think that's the best you'll be able to do here.


Re: No Mapper but Reducer

Posted by Robert Hafner <te...@tedivm.com>.

You could just have a mapper which sends off the exact values it takes in (i.e., outputs k1,v1 as k2,v2). I think that's the best you'll be able to do here.
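
A minimal sketch of such a pass-through mapper, written in Python in the style of a Hadoop Streaming script. The function name is an assumption, and the only behavior shown is copying each input record to the output unchanged.

```python
import io


def identity_map(stream_in, stream_out):
    """Copy every input record to the output unchanged (k1,v1 -> k2,v2)."""
    for line in stream_in:
        stream_out.write(line)


# Demo with in-memory streams; a streaming job would pass sys.stdin and sys.stdout.
source = io.StringIO("k1\tv1\nk2\tv2\n")
sink = io.StringIO()
identity_map(source, sink)
print(sink.getvalue() == "k1\tv1\nk2\tv2\n")  # True
```

The records still pass through sort/shuffle on their way to the reducer, so the reducer receives them grouped by key even though the mapper changed nothing.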




Re: No Mapper but Reducer

Posted by Bejoy KS <be...@gmail.com>.

Thank you all. I had noticed this strange behavior some time back too.
My initial concern still remains, though. If I provide an empty input
directory, then yes, the map tasks won't be executed. But my reducer needs
input to do the processing/aggregation. In such a scenario, is there an
option to provide input just to the reducer?

Regards
Bejoy.K.S


Re: No Mapper but Reducer

Posted by Sudharsan Sampath <su...@gmail.com>.

This is true, and it took us by surprise in the recent past. It also had
quite some impact on our job cycles, where the size of the input is totally
random and can be zero at times.

In one of our cycles we run a lot of jobs. Say we configure X as the number
of reducers for a job that does not have any input, and let

Y -> number of tasktrackers in the cluster

H -> time interval for the heartbeat response

With the cdh2 version, the job takes

(X / Y) * H seconds to complete without doing any work, since only one
reduce task is assigned per heartbeat.
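
To put a number on that estimate, here is a minimal sketch of the same
arithmetic. The function name and the sample values (100 reducers, 10
tasktrackers, a 3 second heartbeat) are made up for illustration; only the
formula (X / Y) * H comes from the description above.

```python
def idle_seconds(num_reducers, num_tasktrackers, heartbeat_seconds):
    """Estimate (X / Y) * H: time an input-less job spends scheduling reducers.

    X = num_reducers, Y = num_tasktrackers, H = heartbeat_seconds.
    Assumes one reduce task is handed out per heartbeat, as described above.
    """
    if num_tasktrackers <= 0:
        raise ValueError("need at least one tasktracker")
    return (num_reducers / num_tasktrackers) * heartbeat_seconds


# Illustrative values only: 100 reducers, 10 tasktrackers, 3 s heartbeat.
print(idle_seconds(100, 10, 3))  # 30.0
```

With those assumed values a single empty job costs about 30 seconds, and
that cost repeats for every such job in the cycle.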


If a cycle contains many such jobs, the total time that the cluster spends
doing nothing accumulates.

I was thinking of raising this as a JIRA, but I am not sure. Should we raise
and fix this as a JIRA request? Could the number of reducers set by the
client be overridden when the number of mappers is 0?

We have a workaround, verifying the existence of the input path to the map
phase ourselves, but I thought it would be more intuitive for the framework
to handle this itself.
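
One possible shape for such a client-side check is sketched below. It is an
illustration, not our actual code: the helper names are hypothetical, and it
assumes the `hadoop fs -count` shell command is available and prints
DIR_COUNT, FILE_COUNT, CONTENT_SIZE and the path name on one line. It goes a
step beyond checking that the path exists by also requiring that it holds
some data.

```python
import subprocess


def input_has_data(count_line):
    """Parse one line of `hadoop fs -count` output.

    Expected columns: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME.
    Returns True only if the path holds at least one non-empty file.
    """
    fields = count_line.split()
    if len(fields) < 4:
        raise ValueError("unexpected -count output: %r" % count_line)
    file_count, content_size = int(fields[1]), int(fields[2])
    return file_count > 0 and content_size > 0


def should_submit(input_path):
    """Hypothetical pre-submit guard: skip the job when the input is empty."""
    result = subprocess.run(
        ["hadoop", "fs", "-count", input_path],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Path missing or not readable: nothing to process.
        return False
    lines = result.stdout.strip().splitlines()
    return bool(lines) and input_has_data(lines[-1])
```

A driver script would call should_submit(path) and simply not launch the job
when it returns False, which avoids scheduling reducers that have no work.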

-Sudhan S


Re: No Mapper but Reducer

Posted by Harsh J <ha...@cloudera.com>.

Oh boy are you in for a surprise. Reducers _can_ run with 0 mappers in a job ;-)

/me puts his troll-mask on.

➜  ~HADOOP_HOME  hadoop fs -mkdir abc
➜  ~HADOOP_HOME  hadoop jar hadoop-examples-0.20.2-cdh3u1.jar wordcount abc out
11/09/07 14:24:14 INFO input.FileInputFormat: Total input paths to process : 0
11/09/07 14:24:14 INFO mapred.JobClient: Running job: job_201109071413_0001
11/09/07 14:24:15 INFO mapred.JobClient:  map 0% reduce 0%
11/09/07 14:24:21 INFO mapred.JobClient:  map 0% reduce 100%
11/09/07 14:24:22 INFO mapred.JobClient: Job complete: job_201109071413_0001
11/09/07 14:24:22 INFO mapred.JobClient: Counters: 13
11/09/07 14:24:22 INFO mapred.JobClient:   Job Counters
11/09/07 14:24:22 INFO mapred.JobClient:     Launched reduce tasks=1
11/09/07 14:24:22 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=2209
11/09/07 14:24:22 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
11/09/07 14:24:22 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
11/09/07 14:24:22 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=3113
11/09/07 14:24:22 INFO mapred.JobClient:   FileSystemCounters
11/09/07 14:24:22 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=59220
11/09/07 14:24:22 INFO mapred.JobClient:   Map-Reduce Framework
11/09/07 14:24:22 INFO mapred.JobClient:     Reduce input groups=0
11/09/07 14:24:22 INFO mapred.JobClient:     Combine output records=0
11/09/07 14:24:22 INFO mapred.JobClient:     Reduce shuffle bytes=0
11/09/07 14:24:22 INFO mapred.JobClient:     Reduce output records=0
11/09/07 14:24:22 INFO mapred.JobClient:     Spilled Records=0
11/09/07 14:24:22 INFO mapred.JobClient:     Combine input records=0
11/09/07 14:24:22 INFO mapred.JobClient:     Reduce input records=0

/me takes off troll mask.




-- 
Harsh J

Re: No Mapper but Reducer

Posted by Bejoy KS <be...@gmail.com>.

Thanks, Sonal. I was just thinking through an unusual design and wanted to
check whether something like that is possible: no maps, only reducers.


Re: No Mapper but Reducer

Posted by Sonal Goyal <so...@gmail.com>.

I don't think that is possible. Can you explain in what scenario you want to
have no mappers, only reducers?

Best Regards,
Sonal
Crux: Reporting for HBase <https://github.com/sonalgoyal/crux>
Nube Technologies <http://www.nubetech.co>

<http://in.linkedin.com/in/sonalgoyal>





On Wed, Sep 7, 2011 at 1:18 PM, Bejoy KS <be...@gmail.com> wrote:

> Hi
>           I'm having a query here. Is it possible to have no mappers but
> reducers alone? AFAIK If we need to avoid the tyriggering of reducers we can
> set numReduceTasks to zero but such a setting on mapper wont work. So how
> can it be achieved if possible?
>
> Thank You
>
> Regards
> Bejoy.K.S
>