Posted to common-user@hadoop.apache.org by Chaman Singh Verma <cs...@yahoo.com> on 2008/04/16 17:28:30 UTC

Aborting Map Function

Hello,

I am developing an application with MapReduce, and whenever a certain
condition is met in one MapTask, I would like to broadcast a message to all
other MapTasks telling them to abort their work. I am not quite sure whether
such broadcasting functionality currently exists in Hadoop MapReduce. Could
someone give some hints?

Extending Hadoop with this functionality may be easy, since all the slaves
periodically ping the master: I was thinking of piggybacking one bit of
information from a slave onto that ping, which the master could then relay
to all the slaves in the next round. Any suggestions on this approach?

Thanks.

With Regards 

-----
Chaman Singh Verma
Poona, India


Re: Aborting Map Function

Posted by Andrzej Bialecki <ab...@getopt.org>.
Owen O'Malley wrote:
> On Apr 16, 2008, at 8:28 AM, Chaman Singh Verma wrote:
> 
>> I am developing an application with MapReduce, and whenever a certain
>> condition is met in one MapTask, I would like to broadcast a message to
>> all other MapTasks telling them to abort their work. I am not quite sure
>> whether such broadcasting functionality currently exists in Hadoop
>> MapReduce. Could someone give some hints?
> 
> This is pretty atypical behavior, but you could have each map look for 
> the existence of an hdfs file every 1 minute or so. When the condition 
> is true, create the file and your maps will exit in the next minute. 
> Except on very large clusters, that wouldn't be too expensive...

See also HADOOP-490. I use the message queue facility in my applications 
(HADOOP-368) but it works only for infrequent communication and smaller 
clusters.

I still think that the job control protocol should allow sending 
"signals" to all tasks of a job. This would eliminate the need for 
polling, because applications could use a simple listener.

-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


Re: Aborting Map Function

Posted by Milind Bhandarkar <mi...@yahoo-inc.com>.
If you want to kill the whole job (I assume that's what you mean by
"aborting all map tasks") from a mapper, you can use:

new JobClient(job).getJob(job.get("mapred.job.id")).killJob();
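
In a mapper written against the old org.apache.hadoop.mapred API, that call
might sit in code roughly like the following (an untested sketch; "job" is the
JobConf handed to configure(), and conditionMet() is a placeholder for whatever
application-specific check triggers the abort):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

public class AbortingMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  private JobConf job;

  public void configure(JobConf job) {
    this.job = job;                  // keep the conf so we can look up our job id later
  }

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    if (conditionMet(value)) {
      // Ask the JobTracker to kill the whole job, including all other tasks.
      new JobClient(job).getJob(job.get("mapred.job.id")).killJob();
      return;
    }
    output.collect(value, value);    // normal per-record work (placeholder)
  }

  private boolean conditionMet(Text value) {
    return false;                    // application-specific condition (placeholder)
  }
}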

- milind


On 4/16/08 10:25 AM, "Owen O'Malley" <oo...@yahoo-inc.com> wrote:

> On Apr 16, 2008, at 8:28 AM, Chaman Singh Verma wrote:
> 
>> I am developing an application with MapReduce, and whenever a certain
>> condition is met in one MapTask, I would like to broadcast a message to
>> all other MapTasks telling them to abort their work. I am not quite sure
>> whether such broadcasting functionality currently exists in Hadoop
>> MapReduce. Could someone give some hints?
> 
> This is pretty atypical behavior, but you could have each map look
> for the existence of an hdfs file every 1 minute or so. When the
> condition is true, create the file and your maps will exit in the
> next minute. Except on very large clusters, that wouldn't be too
> expensive...
> 
> -- Owen

- Milind
-- 
Milind Bhandarkar, Chief Spammer, Grid Team
Y!IM: GridSolutions
408-349-2136 
(milindb@yahoo-inc.com)


Re: Aborting Map Function

Posted by Owen O'Malley <oo...@yahoo-inc.com>.
On Apr 16, 2008, at 8:28 AM, Chaman Singh Verma wrote:

> I am developing an application with MapReduce, and whenever a certain
> condition is met in one MapTask, I would like to broadcast a message to
> all other MapTasks telling them to abort their work. I am not quite sure
> whether such broadcasting functionality currently exists in Hadoop
> MapReduce. Could someone give some hints?

This is pretty atypical behavior, but you could have each map look  
for the existence of an hdfs file every 1 minute or so. When the  
condition is true, create the file and your maps will exit in the  
next minute. Except on very large clusters, that wouldn't be too  
expensive...
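
For example, something along these lines (an untested sketch using the old
org.apache.hadoop.mapred API; the flag path, the one-minute interval, and
conditionMet() are all placeholders):

import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

public class FlagCheckingMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  // Hypothetical location of the shared "abort" flag file.
  private static final Path ABORT_FLAG = new Path("/tmp/myjob/abort-flag");

  private FileSystem fs;
  private long lastCheck = 0;
  private boolean aborted = false;

  public void configure(JobConf job) {
    try {
      fs = FileSystem.get(job);
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    long now = System.currentTimeMillis();
    if (!aborted && now - lastCheck > 60 * 1000) {   // check about once a minute
      lastCheck = now;
      aborted = fs.exists(ABORT_FLAG);
    }
    if (aborted) {
      return;                                        // skip the remaining records
    }
    if (conditionMet(value)) {
      fs.create(ABORT_FLAG).close();                 // signal the other tasks
      aborted = true;
      return;
    }
    output.collect(value, value);                    // normal per-record work (placeholder)
  }

  private boolean conditionMet(Text value) {
    return false;                                    // application-specific condition (placeholder)
  }
}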

-- Owen

Re: Aborting Map Function

Posted by Sagar Naik <sa...@visvo.com>.
Chaman Singh Verma wrote:
> Hello,
>
> I am developing an application with MapReduce, and whenever a certain
> condition is met in one MapTask, I would like to broadcast a message to
> all other MapTasks telling them to abort their work. I am not quite sure
> whether such broadcasting functionality currently exists in Hadoop
> MapReduce. Could someone give some hints?
>
> Extending Hadoop with this functionality may be easy, since all the slaves
> periodically ping the master: I was thinking of piggybacking one bit of
> information from a slave onto that ping, which the master could then relay
> to all the slaves in the next round. Any suggestions on this approach?
>
> Thanks.
>
> With Regards 
>
> -----
> Chaman Singh Verma
> Poona, India
>   
One possible solution could be to use Counters
(http://hadoop.apache.org/core/docs/r0.16.2/api/org/apache/hadoop/mapred/Counters.html).
It is advisable, though, to look into the details of how they are implemented
and to check whether they can really serve as a multi-process shared variable.
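
Combining this with the killJob() call mentioned earlier, one way it might work
is to have the mappers increment a custom counter and the submitting client
poll the aggregated counters, killing the job when the flag appears (an
untested sketch against the old org.apache.hadoop.mapred API; the enum, the
polling interval, and the job setup are placeholders, and the polling happens
in the client because tasks cannot easily see the aggregated counter values of
other tasks):

import org.apache.hadoop.mapred.Counters;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

public class AbortByCounter {

  // Counter that the mappers bump via reporter.incrCounter(AbortFlag.CONDITION_MET, 1)
  // when the abort condition is detected.
  public enum AbortFlag { CONDITION_MET }

  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(AbortByCounter.class);
    // ... normal job setup: input/output paths, mapper class, etc. ...

    RunningJob running = new JobClient(conf).submitJob(conf);   // submit without blocking

    while (!running.isComplete()) {
      Counters counters = running.getCounters();
      if (counters != null
          && counters.getCounter(AbortFlag.CONDITION_MET) > 0) {
        running.killJob();                                      // abort all remaining tasks
        break;
      }
      Thread.sleep(60 * 1000);                                  // poll about once a minute
    }
  }
}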