Posted to common-issues@hadoop.apache.org by "Vinod K V (JIRA)" <ji...@apache.org> on 2009/09/02 13:22:32 UTC

[jira] Created: (HADOOP-6232) NPE in ProcfsBasedProcessTree.destroy()

NPE in ProcfsBasedProcessTree.destroy()
---------------------------------------

                 Key: HADOOP-6232
                 URL: https://issues.apache.org/jira/browse/HADOOP-6232
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Vinod K V
            Priority: Minor



This causes the following exception in TaskMemoryManagerThread. I observed this while running TestTaskTrackerMemoryManager.
{code}
2009-09-02 12:08:25,835 WARN  mapred.TaskMemoryManagerThread (TaskMemoryManagerThread.java:run(239)) - Uncaught exception in TaskMemoryManager while managing memory of attempt_20090902120812252_0001_m_000003_0 : java.lang.NullPointerException
        at org.apache.hadoop.util.ProcfsBasedProcessTree.assertPidPgrpidForMatch(ProcfsBasedProcessTree.java:234)
        at org.apache.hadoop.util.ProcfsBasedProcessTree.assertAndDestroyProcessGroup(ProcfsBasedProcessTree.java:257)
        at org.apache.hadoop.util.ProcfsBasedProcessTree.destroy(ProcfsBasedProcessTree.java:286)
        at org.apache.hadoop.mapred.TaskMemoryManagerThread.run(TaskMemoryManagerThread.java:229)
{code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6232) NPE in ProcfsBasedProcessTree.destroy()

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752419#action_12752419 ] 

Vinod K V commented on HADOOP-6232:
-----------------------------------

A few comments:
 - HADOOP-6230 is already committed, so we need a new patch that applies to the mapred project for this issue.
 - I think it shouldn't be very difficult to test assertAndDestroyProcessGroup()/assertPidPgrpidForMatch() directly
by using the mock-process approach that is already used in TestProcfsBasedProcessTree.
 - Minor:
    -- Fix the new comments/log statements introduced in this patch. They can be worded better and made more accurate.
    -- Wrap lines longer than 80 characters.

I've corrected the patch myself w.r.t. HADOOP-6230 and can confirm that the NPE messages no longer appear in
TestTaskTrackerMemoryManager with this patch.
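
As a rough illustration of the mock-process idea in the second comment, the sketch below builds a fake procfs directory under a temporary folder and reads a process group id back from it. All names here (MockProcfsSketch, newMockProcfsRoot, writeMockStat, readPgrpId) are hypothetical and are not taken from TestProcfsBasedProcessTree or from the patch. The stat line follows the Linux /proc/<pid>/stat layout (pid, command in parentheses, state, ppid, pgrp, session, ...), and the mock simply reuses the group id as the session id.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical test helper; names are illustrative, not from Hadoop.
class MockProcfsSketch {

  // Creates an empty temporary directory that stands in for /proc.
  static Path newMockProcfsRoot() {
    try {
      return Files.createTempDirectory("mock-procfs");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Writes <procfsRoot>/<pid>/stat with a minimal Linux-style stat line.
  // The session id is set to the group id purely for simplicity.
  static void writeMockStat(Path procfsRoot, int pid, int ppid, int pgrpId) {
    String line = pid + " (mock proc) S " + ppid + " " + pgrpId + " "
        + pgrpId + " 0\n";
    try {
      Path pidDir =
          Files.createDirectories(procfsRoot.resolve(Integer.toString(pid)));
      Files.write(pidDir.resolve("stat"),
          line.getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Returns the process group id, or null when the stat file does not
  // exist (for example because the process has already exited).
  static Integer readPgrpId(Path procfsRoot, int pid) {
    Path stat = procfsRoot.resolve(Integer.toString(pid)).resolve("stat");
    String line;
    try {
      line = new String(Files.readAllBytes(stat), StandardCharsets.UTF_8);
    } catch (NoSuchFileException e) {
      return null;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // The command name may contain spaces, so only split the part that
    // follows the last ')'. Fields: [0]=state, [1]=ppid, [2]=pgrp.
    String afterComm = line.substring(line.lastIndexOf(')') + 1).trim();
    String[] fields = afterComm.split("\\s+");
    return Integer.valueOf(fields[2]);
  }
}
```

A test could write stat files for a leader and its children, delete the leader's directory to simulate the race, and then exercise the code under test against the temporary root instead of the real /proc.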



[jira] Updated: (HADOOP-6232) NPE in ProcfsBasedProcessTree.destroy()

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated HADOOP-6232:
---------------------------------

    Attachment: HADOOP-6232.patch

Attaching patch that fixes the issue.

Please review and provide your comments.



[jira] Commented: (HADOOP-6232) NPE in ProcfsBasedProcessTree.destroy()

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750403#action_12750403 ] 

Ravi Gummadi commented on HADOOP-6232:
--------------------------------------

Even though the session leader is gone, child processes in that session may still be running, and they need to be killed.
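
One way to read this: the pid/pgrpid sanity check should treat a missing leader as "nothing to contradict", so that the group kill still goes ahead. The sketch below illustrates only that decision. LeaderCheckSketch, shouldSignalGroup and the lookup function are hypothetical names, and this is not the code from the attached patch.

```java
import java.util.function.IntFunction;

// Hypothetical sketch of the decision logic; not the actual patch.
class LeaderCheckSketch {

  // pgrpLookup returns the process group id for a pid, or null when the
  // process no longer exists (its procfs entry is gone).
  static boolean shouldSignalGroup(int leaderPid,
      IntFunction<Integer> pgrpLookup) {
    Integer pgrpId = pgrpLookup.apply(leaderPid);
    if (pgrpId == null) {
      // The leader has already exited. Its children may still be running
      // in the same group, so proceed instead of dereferencing null.
      return true;
    }
    // The leader is alive: signal the group only if the pid really is the
    // group leader, to avoid killing an unrelated process group.
    return pgrpId == leaderPid;
  }
}
```

With a lookup backed by a map, a pid that maps to itself and a pid that is absent both yield true, while a pid that belongs to some other group yields false.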



[jira] Assigned: (HADOOP-6232) NPE in ProcfsBasedProcessTree.destroy()

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi reassigned HADOOP-6232:
------------------------------------

    Assignee: Ravi Gummadi



[jira] Commented: (HADOOP-6232) NPE in ProcfsBasedProcessTree.destroy()

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750400#action_12750400 ] 

Vinod K V commented on HADOOP-6232:
-----------------------------------

This is mostly a timing issue: it happens when the memory manager tries to destroy a process that has just exited. It didn't affect the test case. The memory manager code doesn't propagate failures across its processing of multiple tasks. The side effects seem to be mostly negligible. Because we remove a task entry from processTreeInfoMap only after destroy succeeds, I think a task entry will be left in the map, but since we have enough null checks in place, this process will just be skipped in further iterations.
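
The behaviour described above can be sketched roughly as follows: each task is handled inside its own try/catch, and the map entry is removed only after destroy() returns normally, so a task whose destroy() throws stays in the map. Everything here (MemoryManagerLoopSketch, ProcessTree, destroyAll) is a hypothetical stand-in rather than the TaskMemoryManagerThread code, and it destroys every entry instead of only the tasks that exceed their memory limit.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the TaskMemoryManagerThread implementation.
class MemoryManagerLoopSketch {

  // Minimal stand-in for a per-task process tree.
  interface ProcessTree {
    void destroy();
  }

  // Tries to destroy every tree in the map and returns the task ids whose
  // destroy() failed. Those entries are left in the map.
  static List<String> destroyAll(
      Map<String, ProcessTree> processTreeInfoMap) {
    List<String> failed = new ArrayList<>();
    Iterator<Map.Entry<String, ProcessTree>> it =
        processTreeInfoMap.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, ProcessTree> entry = it.next();
      try {
        entry.getValue().destroy();
        // Reached only if destroy() did not throw.
        it.remove();
      } catch (RuntimeException e) {
        // The failure is contained to this task; the entry stays in the
        // map and the loop moves on to the next task.
        failed.add(entry.getKey());
      }
    }
    return failed;
  }
}
```

In this sketch a NullPointerException from one task's destroy() leaves exactly that task in the map, which matches the leftover-entry side effect described in the comment.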



[jira] Updated: (HADOOP-6232) NPE in ProcfsBasedProcessTree.destroy()

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-6232:
------------------------------

    Description: 
This causes the following exception in TaskMemoryManagerThread. I observed this while running TestTaskTrackerMemoryManager.
{code}
2009-09-02 12:08:25,835 WARN  mapred.TaskMemoryManagerThread (TaskMemoryManagerThread.java:run(239)) - Uncaught exception in TaskMemoryManager while managing memory of attempt_20090902120812252_0001_m_000003_0 : java.lang.NullPointerException
        at org.apache.hadoop.util.ProcfsBasedProcessTree.assertPidPgrpidForMatch(ProcfsBasedProcessTree.java:234)
        at org.apache.hadoop.util.ProcfsBasedProcessTree.assertAndDestroyProcessGroup(ProcfsBasedProcessTree.java:257)
        at org.apache.hadoop.util.ProcfsBasedProcessTree.destroy(ProcfsBasedProcessTree.java:286)
        at org.apache.hadoop.mapred.TaskMemoryManagerThread.run(TaskMemoryManagerThread.java:229)
{code}


       Priority: Major  (was: Minor)
