Posted to common-user@hadoop.apache.org by Vadim Zaliva <kr...@gmail.com> on 2009/02/09 04:27:36 UTC

lost TaskTrackers

Hi!

I am observing a strange situation in my Hadoop cluster. While running a
task, it eventually gets into this strange mode where:

1. JobTracker reports 0 task trackers.

2. Task tracker processes are alive, but the log file is full of repeating
messages like this:

2009-02-08 19:16:47,761 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200902081049_0001_m_017698_0 done; removing files.
2009-02-08 19:16:47,761 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200902081049_0001_m_017698_0 not found in cache
2009-02-08 19:16:47,761 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200902081049_0001_m_021212_0 done; removing files.
2009-02-08 19:16:47,762 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200902081049_0001_m_021212_0 not found in cache
2009-02-08 19:16:47,762 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200902081049_0001_m_022133_0 done; removing files.

with a new one appearing every couple of seconds.

In the task tracker log, the last two exceptions before these repeating messages are:

2009-02-08 17:46:51,482 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_200902081049_0001_m_075408_3
2009-02-08 17:46:51,482 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_200902081049_0001_m_075408_3
2009-02-08 17:46:51,482 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 8 and trying to launch attempt_200902081049_0001_m_075408_3
2009-02-08 17:46:51,483 WARN org.apache.hadoop.mapred.TaskTracker: Error initializing attempt_200902081049_0001_m_075408_3:
java.lang.NullPointerException
        at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:459)
        at org.apache.hadoop.ipc.Client.call(Client.java:686)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
        at $Proxy5.getFileInfo(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy5.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:578)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:390)
        at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:699)
        at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1636)
        at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:102)
        at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1602)

2009-02-08 17:46:51,483 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 8
2009-02-08 17:46:51,483 INFO org.apache.hadoop.mapred.TaskTracker: Error cleaning up task runner: java.lang.NullPointerException
        at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.cleanup(TaskTracker.java:2298)
        at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1648)
        at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:102)
        at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1602)

2009-02-08 17:46:55,622 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_200902081049_0001
2009-02-08 17:46:55,622 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200902081049_0001_m_005647_0 done; removing files.
2009-02-08 17:46:59,270 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200902081049_0001_m_005647_0 not found in cache

Any suggestions where I should look for the cause of this problem?

Sincerely,
Vadim

P.S. I am using hadoop-0.19.0 on Linux. Java:

java version "1.6.0_12"
Java(TM) SE Runtime Environment (build 1.6.0_12-b04)
Java HotSpot(TM) 64-Bit Server VM (build 11.2-b01, mixed mode)
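
The attempt IDs in the log excerpts above pack several fields into one string. A minimal Python sketch for pulling them apart, assuming the usual attempt_<JobTracker start stamp>_<job number>_<m|r>_<task number>_<attempt number> layout. The names parse_attempt and Attempt are hypothetical helpers for illustration, not part of Hadoop:

```python
import re
from typing import NamedTuple

# Assumed layout: attempt_<JobTracker start stamp>_<job number>_<m|r>_<task number>_<attempt number>
ATTEMPT_RE = re.compile(r"attempt_(\d+)_(\d+)_([mr])_(\d+)_(\d+)")


class Attempt(NamedTuple):
    """Fields of one task attempt ID (hypothetical helper type)."""
    jobtracker_stamp: str  # e.g. "200902081049"
    job_number: int
    kind: str              # "map" or "reduce"
    task_number: int
    attempt_number: int    # zero-based retry counter


def parse_attempt(attempt_id: str) -> Attempt:
    """Split an attempt ID into its fields, or raise ValueError if it does not match."""
    match = ATTEMPT_RE.fullmatch(attempt_id)
    if match is None:
        raise ValueError(f"not an attempt ID: {attempt_id!r}")
    stamp, job, kind, task, attempt = match.groups()
    return Attempt(stamp, int(job), "map" if kind == "m" else "reduce",
                   int(task), int(attempt))
```

Attempt numbers start at 0, so the _3 suffix on attempt_200902081049_0001_m_075408_3 marks the fourth try of map task 75408.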

Re: lost TaskTrackers

Posted by Vadim Zaliva <kr...@gmail.com>.
I am starting to wonder whether Hadoop 0.19 is stable enough for production?

Vadim



Re: lost TaskTrackers

Posted by Vadim Zaliva <kr...@gmail.com>.
Yes, I can access DFS from the cluster. The namenode status seems to be OK
and I see no errors in the namenode log files.

Initially all trackers were visible, and 9433 maps completed
successfully. These were followed by 65975 maps which were killed. In
the log they all show the same error:

Error initializing attempt_200902081049_0001_m_004499_1:
java.lang.NullPointerException
	at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:459)
	at org.apache.hadoop.ipc.Client.call(Client.java:686)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
	at $Proxy5.getFileInfo(Unknown Source)
	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.lang.reflect.Method.invoke(Unknown Source)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
	at $Proxy5.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:578)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:390)
	at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:699)
	at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1636)
	at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:102)
	at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1602)
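
One way to check that every failed attempt really shows the same trace is to tally the exception line and top stack frame that follow each "Error initializing" entry. A minimal Python sketch, assuming the TaskTracker log keeps each WARN entry on one line with the exception and frames on the lines after it, as in the trace above. The function name tally_init_errors is a hypothetical helper:

```python
from collections import Counter
from typing import Iterable, List, Optional

MARKER = "Error initializing attempt_"


def tally_init_errors(lines: Iterable[str]) -> Counter:
    """Count (exception line, top stack frame) pairs that follow each marker line."""
    counts: Counter = Counter()
    pending: Optional[List[str]] = None  # None while idle; a list while collecting
    for raw in lines:
        line = raw.strip()
        if MARKER in line:
            pending = []  # start collecting the next two non-empty lines
            continue
        if pending is None or not line:
            continue
        pending.append(line)
        if len(pending) == 2:
            exception, frame = pending
            if frame.startswith("at "):
                frame = frame[3:]  # drop the leading "at " of a stack frame
            counts[(exception, frame)] += 1
            pending = None
    return counts
```

Feeding it the lines of a TaskTracker log file and printing counts.most_common() should show a single dominant pair if the failures share one cause. Several distinct pairs would point to more than one problem.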

While this is happening, I can access the JobTracker web interface, but
it shows that there are 0 nodes in the cluster. I have tried to run
this task several times and the result is always the same. It works at
first and then starts failing.

Vadim

On Sun, Feb 8, 2009 at 22:19, Amar Kamat <am...@yahoo-inc.com> wrote:
> Looks like an RPC issue. Can you tell more about the cluster? Is there a
> task that finished successfully in this job? Can you access the dfs from the
> trackers?

Re: lost TaskTrackers

Posted by Amar Kamat <am...@yahoo-inc.com>.
Vadim Zaliva wrote:
> Hi!
>
> I am observing strange situation in my Hadoop cluster. While running
> task, eventually it gets into
> this strange mode where:
>
> 1. JobTracker reports 0 task trackers.
>
> 2. Task tracker processes are alive but log file is full of repeating
> messages like this:
>
> 2009-02-08 19:16:47,761 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200902081049_0001_m_017698_0 done; removing files.
> 2009-02-08 19:16:47,761 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200902081049_0001_m_017698_0 not found in cache
> 2009-02-08 19:16:47,761 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200902081049_0001_m_021212_0 done; removing files.
> 2009-02-08 19:16:47,762 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200902081049_0001_m_021212_0 not found in cache
> 2009-02-08 19:16:47,762 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200902081049_0001_m_022133_0 done; removing files.
>
> with new one appearing every couple of seconds.
>
> In the task tracker log, before these repeating messages last 2 exceptions are:
>
> 2009-02-08 17:46:51,482 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_200902081049_0001_m_075408_3
> 2009-02-08 17:46:51,482 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_200902081049_0001_m_075408_3
> 2009-02-08 17:46:51,482 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 8 and trying to launch attempt_200902081049_0001_m_075408_3
> 2009-02-08 17:46:51,483 WARN org.apache.hadoop.mapred.TaskTracker: Error initializing attempt_200902081049_0001_m_075408_3:
> java.lang.NullPointerException
>         at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:459)
>         at org.apache.hadoop.ipc.Client.call(Client.java:686)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>         at $Proxy5.getFileInfo(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>         at $Proxy5.getFileInfo(Unknown Source)
>         at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:578)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:390)
>         at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:699)
>         at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1636)
>         at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:102)
>         at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1602)
>
>   
Looks like an RPC issue. Can you tell more about the cluster? Is there a 
task that finished successfully in this job? Can you access the dfs from 
the trackers?
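
A quick first check for the DFS question is whether each tracker host can open a TCP connection to the NameNode's RPC address at all. A minimal Python sketch to run on a tracker node; the host and port are placeholders that would come from fs.default.name in the cluster configuration, and can_connect is a hypothetical helper. An open port only shows reachability, not that RPC calls succeed, so running `hadoop fs -ls /` from the same node is the more complete test:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the host and closes the socket on exit.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, can_connect("namenode.example.com", 9000) with the real NameNode host and port substituted in. Both values here are assumptions, not taken from this cluster.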