Posted to common-dev@hadoop.apache.org by "Koji Noguchi (JIRA)" <ji...@apache.org> on 2007/03/28 09:19:32 UTC

[jira] Created: (HADOOP-1171) NPE in TaskTracker RenitTrackerAction

NPE in TaskTracker RenitTrackerAction 
--------------------------------------

                 Key: HADOOP-1171
                 URL: https://issues.apache.org/jira/browse/HADOOP-1171
             Project: Hadoop
          Issue Type: Bug
          Components: mapred
            Reporter: Koji Noguchi


TaskTracker log shows =============

2007-03-28 02:32:18,076 INFO org.apache.hadoop.mapred.TaskTracker: Recieved RenitTrackerAction from JobTracker
2007-03-28 02:32:18,494 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.lang.NullPointerException
  at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.jobHasFinished(TaskTracker.java:1187)
  at org.apache.hadoop.mapred.TaskTracker.close(TaskTracker.java:430)
  at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:917)
  at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1589)

JobTracker log shows ==============
2007-03-28 02:31:18,977 INFO org.apache.hadoop.mapred.JobTracker: Lost tracker '____.____.com'
...
2007-03-28 02:31:18,977 INFO org.apache.hadoop.mapred.JobInProgress: TaskTracker at 'tracker____.___.com' turned 'flaky'
...
2007-03-28 02:32:18,075 WARN org.apache.hadoop.mapred.JobTracker: Status from unknown Tracker : tracker____.___.com:#####


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-1171) NPE in TaskTracker RenitTrackerAction

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12526390 ] 

Alejandro Abdelnur commented on HADOOP-1171:
--------------------------------------------

I'm getting the NPE on 0.12.4 when that box runs out of disk space.

Besides hunting down the cause of the NPE, the close() call in the finally block should be wrapped in a try/catch.
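
To illustrate the suggestion, here is a minimal sketch of a close() call guarded inside a finally block. It is not the TaskTracker code: FakeTracker, offerService() as used here, runOnce() and the recorded event strings are all hypothetical stand-ins, and the NPE is simulated. The only point is that a failure during cleanup is caught and recorded rather than escaping the finally block.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the Hadoop source.
class GuardedCloseSketch {

    /** Stand-in for a service whose close() can itself fail. */
    static class FakeTracker {
        void offerService() throws IOException {
            // Simulates the disk-full failure seen in the logs.
            throw new IOException("No space left on device");
        }

        void close() {
            // Simulates an NPE raised while cleaning up.
            throw new NullPointerException("simulated failure inside close()");
        }
    }

    /** Runs one service cycle and returns the recorded events in order. */
    static List<String> runOnce(FakeTracker tracker) {
        List<String> events = new ArrayList<>();
        try {
            tracker.offerService();
        } catch (IOException e) {
            events.add("service failed: " + e.getMessage());
        } finally {
            try {
                tracker.close();
            } catch (Exception e) {
                // Record the cleanup failure instead of letting it
                // propagate out of the finally block.
                events.add("close failed: " + e);
            }
        }
        events.add("run loop continues");
        return events;
    }

    public static void main(String[] args) {
        for (String event : runOnce(new FakeTracker())) {
            System.out.println(event);
        }
    }
}
```

With the guard in place the caller still sees the original service failure, and the cleanup failure is reported separately instead of aborting the run loop.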

TaskTracker logs ===================

2007-09-10 14:08:12,096 FATAL org.apache.hadoop.mapred.TaskTracker: FSError, exiting: java.io.IOException: No space left on device
2007-09-10 14:08:12,152 WARN org.apache.hadoop.mapred.TaskRunner: task_0263_m_000003_0 Error reading child output
java.io.IOException: No space left on device
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:260)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
        at org.apache.hadoop.mapred.TaskLog$Writer.write(TaskLog.java:212)
        at org.apache.hadoop.mapred.TaskRunner.logStream(TaskRunner.java:439)
        at org.apache.hadoop.mapred.TaskRunner.access$200(TaskRunner.java:33)
        at org.apache.hadoop.mapred.TaskRunner$1.run(TaskRunner.java:398)
2007-09-10 14:08:12,159 WARN org.apache.hadoop.mapred.TaskRunner: task_0263_m_000002_0 Child Error
java.io.IOException: No space left on device
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:260)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
        at org.apache.hadoop.mapred.TaskLog$Writer.writeIndexRecord(TaskLog.java:251)
        at org.apache.hadoop.mapred.TaskLog$Writer.close(TaskLog.java:235)
        at org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:416)
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:291)
2007-09-10 14:08:12,176 INFO org.apache.hadoop.mapred.TaskTracker: task_0264_r_000000_0 0.0% reduce > copy >
2007-09-10 14:08:12,644 INFO org.apache.hadoop.mapred.TaskTracker: task_0263_m_000003_0 0.82008606% Success
2007-09-10 14:08:12,919 INFO org.apache.hadoop.mapred.TaskTracker: task_0266_r_000000_0 0.0% reduce > copy >
2007-09-10 14:08:13,178 INFO org.apache.hadoop.mapred.TaskTracker: task_0264_r_000000_0 0.0% reduce > copy >
2007-09-10 14:08:13,648 INFO org.apache.hadoop.mapred.TaskTracker: task_0263_m_000003_0 0.82239765% Success
2007-09-10 14:08:13,921 INFO org.apache.hadoop.mapred.TaskTracker: task_0266_r_000000_0 0.0% reduce > copy >
2007-09-10 14:08:14,181 INFO org.apache.hadoop.mapred.TaskTracker: task_0264_r_000000_0 0.0% reduce > copy >
2007-09-10 14:08:14,652 INFO org.apache.hadoop.mapred.TaskTracker: task_0263_m_000003_0 0.82471263% Success
2007-09-10 14:08:14,854 INFO org.apache.hadoop.mapred.TaskRunner: task_0264_r_000000_0 Need 8 map output(s)
2007-09-10 14:08:14,854 INFO org.apache.hadoop.mapred.TaskRunner: task_0264_r_000000_0 Need 8 map output location(s)
2007-09-10 14:08:14,856 INFO org.apache.hadoop.mapred.TaskRunner: task_0264_r_000000_0 Got 0 new map outputs from jobtracker and 0 map outputs from previous failures
2007-09-10 14:08:14,856 INFO org.apache.hadoop.mapred.TaskRunner: task_0264_r_000000_0 Got 0 known map output location(s); scheduling...
2007-09-10 14:08:14,856 INFO org.apache.hadoop.mapred.TaskRunner: task_0264_r_000000_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
2007-09-10 14:08:14,923 INFO org.apache.hadoop.mapred.TaskTracker: task_0266_r_000000_0 0.0% reduce > copy >
2007-09-10 14:08:15,020 INFO org.apache.hadoop.mapred.TaskRunner: task_0266_r_000000_0 Need 2 map output(s)
2007-09-10 14:08:15,02
2007-09-10 14:08:15,182 INFO org.apache.hadoop.mapred.TaskRunner: task_0258_m_000003_0 done; removing files.
2007-09-10 14:08:15,197 INFO org.apache.hadoop.mapred.TaskRunner: task_0258_m_000005_0 done; removing files.
2007-09-10 14:08:15,211 INFO org.apache.hadoop.mapred.TaskRunner: task_0259_m_000002_0 done; removing files.
2007-09-10 14:08:15,227 INFO org.apache.hadoop.mapred.TaskTracker: task_0264_r_000000_0 0.0% reduce > copy >
2007-09-10 14:08:15,227 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.lang.NullPointerException
        at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.jobHasFinished(TaskTracker.java:1187)
        at org.apache.hadoop.mapred.TaskTracker.close(TaskTracker.java:430)
        at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:917)
        at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1589)






[jira] Updated: (HADOOP-1171) NPE in TaskTracker RenitTrackerAction

Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Noguchi updated HADOOP-1171:
---------------------------------

    Affects Version/s: 0.12.1

It was running 0.12.1. Thanks.



[jira] Commented: (HADOOP-1171) NPE in TaskTracker RenitTrackerAction

Posted by "Tahir Hashmi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12493992 ] 

Tahir Hashmi commented on HADOOP-1171:
--------------------------------------

I went through the code to figure out why the NPE might have been triggered. It seems that jobHasFinished() may have synchronisation issues: concurrent threads call this function, and there is a call to TaskInProgress.cleanup(), which removes TIPs from the tasks map and is not synchronised.

There may be a race condition when two threads try to cleanup() the same TIP. Someone who's familiar with this code needs to comment on it.
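
A minimal sketch of the kind of guard that would close such a race, assuming the hypothesis above is right. This is not the TaskTracker implementation: the class, the tasks map layout, the method names and the task id are all hypothetical. The removal and the "was it still there?" check happen under one lock, so when several threads clean up the same entry only one of them performs the cleanup and the others see that it is already gone instead of dereferencing a missing entry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names do not come from the Hadoop source.
class SynchronizedCleanupSketch {
    private final Map<String, String> tasks = new HashMap<>(); // task id -> state
    private int cleanups = 0;

    synchronized void addTask(String id) {
        tasks.put(id, "RUNNING");
    }

    /** Returns true only for the caller that actually removed the task. */
    synchronized boolean cleanup(String id) {
        if (tasks.remove(id) == null) {
            return false; // another thread already cleaned this task up
        }
        cleanups++;
        return true;
    }

    synchronized int cleanupCount() {
        return cleanups;
    }

    /** Lets several threads clean up the same task; returns how many cleanups ran. */
    static int raceOnSameTask(int threadCount) {
        SynchronizedCleanupSketch tracker = new SynchronizedCleanupSketch();
        tracker.addTask("task_example");
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(() -> tracker.cleanup("task_example"));
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for threads", e);
        }
        return tracker.cleanupCount();
    }

    public static void main(String[] args) {
        System.out.println("cleanups performed: " + raceOnSameTask(8));
    }
}
```

Whether the real fix belongs in cleanup(), in jobHasFinished(), or in the caller is a question for someone who knows the code; the sketch only shows the check-and-remove step made atomic.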



[jira] Commented: (HADOOP-1171) NPE in TaskTracker RenitTrackerAction

Posted by "Tahir Hashmi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12493348 ] 

Tahir Hashmi commented on HADOOP-1171:
--------------------------------------

Which version of Hadoop was this bug encountered in? The line numbers in the stack trace are no longer valid in the current configuration.



[jira] Assigned: (HADOOP-1171) NPE in TaskTracker RenitTrackerAction

Posted by "Tahir Hashmi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tahir Hashmi reassigned HADOOP-1171:
------------------------------------

    Assignee: Tahir Hashmi
