Posted to common-user@hadoop.apache.org by Chaman Singh Verma <cs...@yahoo.com> on 2008/04/16 17:28:30 UTC
Aborting Map Function
Hello,
I am developing an application with MapReduce, and whenever a certain
condition is met in one MapTask, I would like to broadcast to all other
MapTasks so that they abort their work. I am not quite sure whether such
broadcast functionality currently exists in Hadoop MapReduce. Could
someone give some hints?
Extending Hadoop with this functionality may be easy, since all the slaves
periodically ping the master: I was thinking of piggybacking one bit of
information from a slave to the master, which the master could then send
to all the slaves in the next round. Any suggestions on this approach?
Thanks.
With Regards
-----
Chaman Singh Verma
Poona, India
--
View this message in context: http://www.nabble.com/Aborting-Map-Function-tp16722552p16722552.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
Re: Aborting Map Function
Posted by Andrzej Bialecki <ab...@getopt.org>.
Owen O'Malley wrote:
> On Apr 16, 2008, at 8:28 AM, Chaman Singh Verma wrote:
>
>> I am developing one application with MapReduce and in that whenever some
>> MapTask condition is
>> met, I would like to broadcast to all other MapTask to abort their
>> work. I
>> am not quite sure whether
>> such broadcasting functionality currently exist in Hadoop MapReduce.
>> Could
>> someone give some
>> hints.
>
> This is pretty atypical behavior, but you could have each map look for
> the existence of an hdfs file every 1 minute or so. When the condition
> is true, create the file and your maps will exit in the next minute.
> Except on very large clusters, that wouldn't be too expensive...
See also HADOOP-490. I use the message queue facility in my applications
(HADOOP-368) but it works only for infrequent communication and smaller
clusters.
I still think that the job control protocol should allow sending
"signals" to all tasks of a job. This would eliminate the need for
polling, because applications could use a simple listener.
--
Best regards,
Andrzej Bialecki <><
  Information Retrieval, Semantic Web
  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
Re: Aborting Map Function
Posted by Milind Bhandarkar <mi...@yahoo-inc.com>.
If you want to kill the whole job (I assume that's what you mean by
"aborting all map tasks") from a mapper, you can use:
new JobClient(jobConf).getJob(job.get("mapred.job.id")).killJob();
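In context, a mapper can capture the JobConf in configure() and issue the kill once the condition fires. A sketch against the 0.16-era org.apache.hadoop.mapred API (not runnable outside a cluster; abortConditionMet() is a hypothetical application-specific helper):

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

// Sketch only: assumes the old (pre-0.20) mapred API.
public class AbortingMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

    private JobConf conf;

    @Override
    public void configure(JobConf job) {
        this.conf = job;  // keep the JobConf so we can look up our own job id
    }

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, Text> out, Reporter reporter)
            throws IOException {
        if (abortConditionMet(value)) {
            // Ask the JobTracker to kill every task of this job.
            new JobClient(conf).getJob(conf.get("mapred.job.id")).killJob();
            return;
        }
        // ... normal map work ...
    }

    // Placeholder for whatever condition the application checks.
    private boolean abortConditionMet(Text value) { return false; }
}
```

Note that killJob() tears down the entire job, including already-completed map output, so it fits a "stop everything" abort rather than a "finish early with partial results" one.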
- milind
On 4/16/08 10:25 AM, "Owen O'Malley" <oo...@yahoo-inc.com> wrote:
> On Apr 16, 2008, at 8:28 AM, Chaman Singh Verma wrote:
>
>> I am developing one application with MapReduce and in that whenever
>> some
>> MapTask condition is
>> met, I would like to broadcast to all other MapTask to abort their
>> work. I
>> am not quite sure whether
>> such broadcasting functionality currently exist in Hadoop
>> MapReduce. Could
>> someone give some
>> hints.
>
> This is pretty atypical behavior, but you could have each map look
> for the existence of an hdfs file every 1 minute or so. When the
> condition is true, create the file and your maps will exit in the
> next minute. Except on very large clusters, that wouldn't be too
> expensive...
>
> -- Owen
- Milind
--
Milind Bhandarkar, Chief Spammer, Grid Team
Y!IM: GridSolutions
408-349-2136
(milindb@yahoo-inc.com)
Re: Aborting Map Function
Posted by Owen O'Malley <oo...@yahoo-inc.com>.
On Apr 16, 2008, at 8:28 AM, Chaman Singh Verma wrote:
> I am developing one application with MapReduce and in that whenever
> some
> MapTask condition is
> met, I would like to broadcast to all other MapTask to abort their
> work. I
> am not quite sure whether
> such broadcasting functionality currently exist in Hadoop
> MapReduce. Could
> someone give some
> hints.
This is pretty atypical behavior, but you could have each map look
for the existence of an hdfs file every 1 minute or so. When the
condition is true, create the file and your maps will exit in the
next minute. Except on very large clusters, that wouldn't be too
expensive...
-- Owen
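The polling Owen describes can be kept cheap by remembering when the flag was last checked. A minimal sketch of that logic; in a real job the check would be FileSystem.exists() on an HDFS Path, but it is written here against java.nio.file so it is self-contained (AbortFlagChecker is a hypothetical helper, not Hadoop API):

```java
import java.nio.file.Files;
import java.nio.file.Path;

/** Polls for an abort-flag file at most once per interval. */
class AbortFlagChecker {
    private final Path flag;
    private final long intervalMs;
    private long lastCheck = 0;      // epoch ms of the last filesystem check
    private boolean aborted = false;

    AbortFlagChecker(Path flag, long intervalMs) {
        this.flag = flag;
        this.intervalMs = intervalMs;
    }

    /** Returns true once the flag file appears; hits the filesystem
     *  at most once per intervalMs, so calling it per record is cheap. */
    boolean shouldAbort() {
        long now = System.currentTimeMillis();
        if (!aborted && now - lastCheck >= intervalMs) {
            lastCheck = now;
            aborted = Files.exists(flag);
        }
        return aborted;
    }
}

public class Main {
    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("abortdemo");
        Path flag = dir.resolve("ABORT");
        // intervalMs = 0 so every call re-checks, for demonstration;
        // a mapper would use something like 60_000 per Owen's suggestion.
        AbortFlagChecker checker = new AbortFlagChecker(flag, 0);
        System.out.println(checker.shouldAbort()); // flag absent
        Files.createFile(flag);
        System.out.println(checker.shouldAbort()); // flag present
    }
}
```

A mapper would call shouldAbort() at the top of map() and return (or throw) when it goes true; creating the flag file from the task that hit the condition is then the "broadcast".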
Re: Aborting Map Function
Posted by Sagar Naik <sa...@visvo.com>.
Chaman Singh Verma wrote:
> Hello,
>
> I am developing one application with MapReduce and in that whenever some
> MapTask condition is
> met, I would like to broadcast to all other MapTask to abort their work. I
> am not quite sure whether
> such broadcasting functionality currently exist in Hadoop MapReduce. Could
> someone give some
> hints.
>
> Although extending this functionality may be easy as all the slaves
> periodically ping the master,
> I was just thinking of piggybacking one bit information from the slave to
> the master and master
> may send this information to all the slaves in the next round. Any
> suggestions to this approach ?
>
> Thanks.
>
> With Regards
>
> -----
> Chaman Singh Verma
> Poona, India
>
One possible solution could be to use Counters
(http://hadoop.apache.org/core/docs/r0.16.2/api/org/apache/hadoop/mapred/Counters.html).
It is advisable, though, to look into the details of their implementation
and see whether they can really serve as a multi-process shared variable.
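One caveat with this idea: counters are aggregated at the JobTracker, so mid-job they behave more like a write-mostly accumulator than a variable every task can read; typically a monitor polls the aggregated value and acts on it. The shape of that pattern, sketched locally with threads standing in for tasks and an AtomicLong standing in for a counter (all names here are illustrative, not Hadoop API):

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Local stand-in: "tasks" are threads, the "counter" is an AtomicLong,
// and a monitor flips an abort flag once the counter goes nonzero --
// analogous to a driver polling job counters and calling killJob().
public class CounterAbortDemo {
    static final AtomicLong conditionHits = new AtomicLong();    // ~ a job counter
    static final AtomicBoolean abort = new AtomicBoolean(false); // ~ job killed

    static void runTask(int id) {
        for (int i = 0; i < 1000 && !abort.get(); i++) {
            if (id == 3 && i == 10) {
                conditionHits.incrementAndGet(); // this task reports the condition
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread monitor = new Thread(() -> {
            while (!abort.get()) {
                if (conditionHits.get() > 0) abort.set(true); // "kill" the job
            }
        });
        monitor.start();

        Thread[] tasks = new Thread[4];
        for (int t = 0; t < tasks.length; t++) {
            final int id = t;
            tasks[t] = new Thread(() -> runTask(id));
            tasks[t].start();
        }
        for (Thread t : tasks) t.join();
        monitor.join();
        System.out.println("aborted=" + abort.get());
    }
}
```

The point of the sketch is the division of labor: tasks only write the counter, and a single observer converts the aggregated value into the abort decision.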