Posted to common-user@hadoop.apache.org by Adam Shook <as...@clearedgeit.com> on 2011/08/04 00:33:46 UTC

Kill Task Programmatically

Is there any way I can programmatically kill or fail a task, preferably from inside a Mapper or Reducer?

I have a use case where, at any point during a map or reduce task, I know the task won't succeed based solely on the machine it is running on.  It is rare, but I would prefer to kill the task and have Hadoop start it up on a different machine as usual, instead of waiting for the 10-minute default timeout.

I suppose speculative execution could take care of it, but I would rather not rely on it if I am able to kill the task myself.

Thanks,
Adam

Re: Kill Task Programmatically

Posted by Harsh J <ha...@cloudera.com>.
Hello,

Adding to Aleksandr's suggestion, you could also lower the task timeout
(mapred.task.timeout) if a condition to throw on can't be determined.
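
Something like this, as a sketch against the old JobConf API
(mapred.task.timeout is in milliseconds; the default is 600000, i.e.
10 minutes; the class name here is just illustrative):

    import org.apache.hadoop.mapred.JobConf;

    public class ShortTimeoutConf {
        public static JobConf create() {
            JobConf conf = new JobConf();
            // A task that reports no progress (no input read, no output
            // written, no status update) within this window is killed
            // and rescheduled, possibly on another node.
            conf.setLong("mapred.task.timeout", 120000L); // 2 min, not 10
            return conf;
        }
    }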

On Thu, Aug 4, 2011 at 5:10 AM, Aleksandr Elbakyan <ra...@yahoo.com> wrote:
> You can just throw a runtime exception. In that case the task will fail :)



-- 
Harsh J

RE: Kill Task Programmatically

Posted by Devaraj K <de...@huawei.com>.
Adam,

   You can use the RunningJob.killTask(TaskAttemptID taskId, boolean
shouldFail) API to kill the task.

Clients can get hold of a RunningJob via the JobClient and then use it to
kill the task.


Refer to the API doc:
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/RunningJob.html#killTask(org.apache.hadoop.mapred.TaskAttemptID, boolean)
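
A minimal sketch of that flow (the job and task attempt IDs are assumed
to be known to the caller, e.g. from the job tracker UI):

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.JobID;
    import org.apache.hadoop.mapred.RunningJob;
    import org.apache.hadoop.mapred.TaskAttemptID;

    public class TaskKiller {
        public static void killAttempt(String jobId, String attemptId)
                throws Exception {
            JobClient client = new JobClient(new JobConf());
            RunningJob job = client.getJob(JobID.forName(jobId));
            // shouldFail = true marks the attempt FAILED and counts it
            // against mapred.map.max.attempts; false just kills it so
            // it is rescheduled without counting as a failure.
            job.killTask(TaskAttemptID.forName(attemptId), false);
        }
    }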


Devaraj K 


Re: Kill Task Programmatically

Posted by Aleksandr Elbakyan <ra...@yahoo.com>.
Hello,

You can just throw a runtime exception. In that case the task will fail :)
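
For example, from inside a mapper (new-API Mapper shown; the bad-host
check is a hypothetical placeholder for your machine-specific condition):

    import java.io.IOException;
    import java.net.InetAddress;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class FailFastMapper
            extends Mapper<LongWritable, Text, Text, LongWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            if (hostCannotSucceed()) {
                // Marks this attempt FAILED; the framework retries it,
                // up to mapred.map.max.attempts times, usually on a
                // different node.
                throw new RuntimeException(
                    "This host cannot complete the task");
            }
            // ... normal map logic would go here ...
        }

        // Hypothetical stand-in for the machine-specific check.
        private boolean hostCannotSucceed() throws IOException {
            String host = InetAddress.getLocalHost().getHostName();
            return host.startsWith("bad-node"); // placeholder condition
        }
    }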

Regards,
Aleksandr 
