
Posted to common-user@hadoop.apache.org by Meng Mao <me...@gmail.com> on 2010/02/01 22:20:21 UTC

Repeated attempts to kill old job?

On our worker nodes, I see repeated requests for KillJobActions for the same
old job:
2010-01-31 00:00:01,024 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_201001261532_0690
2010-01-31 00:00:01,064 WARN org.apache.hadoop.mapred.TaskTracker: Unknown job job_201001261532_0690 being deleted.

This request/response pair is repeated almost 30k times over the course of
the day. Other nodes show the same behavior, but with different job IDs.
The jobs presumably all ran in the past and either completed or were killed
manually. We use the grid fairly actively, and that job ID is several hundred
jobs behind the current one.
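
To put a number on how often each stale job is targeted, something like the
following Python sketch could tally the requests per job ID. It assumes each
log record sits on a single line in the TaskTracker log (the lines quoted
above are wrapped by the mail client); the function name count_kill_actions
and the sample input are made up for illustration.

```python
import re
from collections import Counter

# Matches the "Received 'KillJobAction'" message format quoted above.
KILL_RE = re.compile(r"Received 'KillJobAction' for job: (job_\d+_\d+)")


def count_kill_actions(lines):
    """Return a Counter mapping job ID -> number of KillJobAction requests."""
    counts = Counter()
    for line in lines:
        match = KILL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Illustrative input: the two log records from the post, unwrapped.
sample = [
    "2010-01-31 00:00:01,024 INFO org.apache.hadoop.mapred.TaskTracker: "
    "Received 'KillJobAction' for job: job_201001261532_0690",
    "2010-01-31 00:00:01,064 WARN org.apache.hadoop.mapred.TaskTracker: "
    "Unknown job job_201001261532_0690 being deleted.",
]
print(dict(count_kill_actions(sample)))  # {'job_201001261532_0690': 1}
```

Passing an open log file instead of the sample list works the same way, since
the function only iterates over lines.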

Has anyone seen this before? Is there a way to stop it?

Re: Repeated attempts to kill old job?

Posted by Meng Mao <me...@gmail.com>.

We restarted the grid, and that did stop the repeated KillJobAction attempts.
I forgot to look around with hadoop dfsadmin, though.

On Mon, Feb 1, 2010 at 11:29 PM, Rekha Joshi <re...@yahoo-inc.com> wrote:

> I would say restart the cluster, but I suspect that would not help either.
> Instead, try checking your running process list (e.g. a perl/shell script or
> an ETL pipeline job) for something to analyze/kill.
> Also wondering if any hadoop dfsadmin commands could help in this
> scenario.
>
> Cheers,
> /R

Re: Repeated attempts to kill old job?

Posted by Rekha Joshi <re...@yahoo-inc.com>.

I would say restart the cluster, but I suspect that would not help either. Instead, try checking your running process list (e.g. a perl/shell script or an ETL pipeline job) for something to analyze/kill.
Also wondering if any hadoop dfsadmin commands could help in this scenario.

Cheers,
/R

