Posted to mapreduce-dev@hadoop.apache.org by "Chris Riccomini (JIRA)" <ji...@apache.org> on 2011/09/22 18:39:27 UTC

[jira] [Created] (MAPREDUCE-3072) NodeManager doesn't recognize kill -9 of AM container

NodeManager doesn't recognize kill -9 of AM container
-----------------------------------------------------

                 Key: MAPREDUCE-3072
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3072
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: nodemanager
    Affects Versions: 0.23.0
         Environment: [criccomi@criccomi-ld trunk]$ svn info
Path: .
URL: http://svn.apache.org/repos/asf/hadoop/common/trunk
Repository Root: http://svn.apache.org/repos/asf
Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
Revision: 1174189
Node Kind: directory
Schedule: normal
Last Changed Author: szetszwo
Last Changed Rev: 1173990
Last Changed Date: 2011-09-22 01:25:20 -0700 (Thu, 22 Sep 2011)

            Reporter: Chris Riccomini


If I kill -9 my application master's pid, the NM continues reporting that the container is running. I would expect it to report back to the RM that the AM has died. Instead, it keeps sending this status:


2011-09-22 09:33:13,352 INFO  nodemanager.NodeStatusUpdaterImpl (NodeStatusUpdaterImpl.java:getNodeStatus(222)) - Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 1, cluster_timestamp: 1316707951832, }, attemptId: 1, }, id: 1, }, state: C_RUNNING, diagnostics: "\n", exit_status: -1000, 

2011-09-22 09:33:13,682 INFO  monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(402)) - Memory usage of ProcessTree 27263 for container-id container_1316707951832_0001_01_000001 : Virtual 0 bytes, limit : 2147483648 bytes; Physical 0 bytes, limit -1 bytes

This status keeps being sent forever.
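
For reference, here is a minimal sketch of the kind of liveness check I would expect to flip that state. It is not the NodeManager's actual code; `pid_is_alive` is a hypothetical helper, and it assumes a POSIX system where sending signal 0 only tests whether the pid exists.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this pid currently exists (POSIX only).

    Hypothetical helper for illustration; not part of Hadoop.
    """
    try:
        # Signal 0 delivers nothing; the kernel only performs the
        # existence and permission checks for the target pid.
        os.kill(pid, 0)
    except ProcessLookupError:
        # No such process: it has exited and been reaped.
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

Note that a killed process that has not yet been reaped by its parent (a zombie) still counts as existing under this check, so a real monitor would also need to collect the child's exit status.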

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (MAPREDUCE-3072) NodeManager doesn't recognize kill -9 of AM container

Posted by "Vinod Kumar Vavilapalli (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/MAPREDUCE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli resolved MAPREDUCE-3072.
------------------------------------------------

    Resolution: Duplicate

This is the same as MAPREDUCE-3031.

