Posted to yarn-issues@hadoop.apache.org by "Joseph (JIRA)" <ji...@apache.org> on 2015/11/04 19:50:28 UTC

[jira] [Updated] (YARN-4331) Killing NodeManager leaves orphaned containers

     [ https://issues.apache.org/jira/browse/YARN-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph updated YARN-4331:
-------------------------
    Description: 
We are seeing a lot of orphaned containers running in our production clusters.
I tried to simulate this locally on my machine and can reproduce the issue by killing the NodeManager.
I'm running YARN 2.7.1 with RM state stored in ZooKeeper, and deploying Samza jobs.
Steps:
{quote}1. Deploy a job 
2. Send kill -9 to the NodeManager 
3. We should see the AM and its container still running without the NodeManager
4. The AM should then die, but the container keeps running
5. Restarting the NodeManager brings up a new AM and a new container, but leaves the orphaned container running in the background
{quote}
This is effectively causing double processing of data.


  was:
We are seeing a lot of orphaned containers running in our production clusters.
I tried to simulate this locally on my machine and can replicate the issue by killing nodemanager.
I'm running Yarn 2.7.1 with RM state stored in zookeeper and deploying samza jobs.
Steps:
1. Deploy a job 
2. Issue a kill -9 signal to nodemanager 
3. We should see the AM and its container running without nodemanager
4. AM should die but the container still keeps running
5. Restarting nodemanager brings up new AM and container but leaves the orphaned container running in the background

This is effectively causing double processing of data.



> Killing NodeManager leaves orphaned containers
> ----------------------------------------------
>
>                 Key: YARN-4331
>                 URL: https://issues.apache.org/jira/browse/YARN-4331
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager, yarn
>    Affects Versions: 2.7.1
>            Reporter: Joseph
>            Priority: Critical
>
> We are seeing a lot of orphaned containers running in our production clusters.
> I tried to simulate this locally on my machine and can reproduce the issue by killing the NodeManager.
> I'm running YARN 2.7.1 with RM state stored in ZooKeeper, and deploying Samza jobs.
> Steps:
> {quote}1. Deploy a job 
> 2. Send kill -9 to the NodeManager 
> 3. We should see the AM and its container still running without the NodeManager
> 4. The AM should then die, but the container keeps running
> 5. Restarting the NodeManager brings up a new AM and a new container, but leaves the orphaned container running in the background
> {quote}
> {quote}
> This is effectively causing double processing of data.
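
To check for the orphans described in step 5, something like the following rough sketch may help. It is not part of YARN. It assumes that container processes carry their container ID (container_<clusterTimestamp>_<appId>_<attempt>_<id>) somewhere on the command line, and it treats any such process that is not a descendant of the currently running NodeManager as orphaned. The function names and the sample process table are hypothetical, and the heuristic can misfire on unrelated processes that merely mention a container ID.

```python
import re
from typing import Dict, List, Optional, Tuple

# YARN container IDs: container_<clusterTimestamp>_<appId>_<attempt>_<id>,
# optionally with an epoch prefix (container_e<epoch>_...).
CONTAINER_ID = re.compile(r"container_(?:e\d+_)?\d+_\d+_\d+_\d+")


def parse_ps(ps_output: str) -> Dict[int, Tuple[int, str]]:
    """Parse `ps -eo pid,ppid,args` output into {pid: (ppid, args)}."""
    procs: Dict[int, Tuple[int, str]] = {}
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue  # skip the header row and malformed lines
        procs[int(parts[0])] = (int(parts[1]), parts[2])
    return procs


def descends_from(pid: int, ancestor: int,
                  procs: Dict[int, Tuple[int, str]]) -> bool:
    """Walk the parent chain of `pid` and report whether it reaches `ancestor`."""
    seen = set()
    while pid in procs and pid not in seen:
        seen.add(pid)
        pid = procs[pid][0]
        if pid == ancestor:
            return True
    return False


def find_orphans(ps_output: str,
                 nodemanager_pid: Optional[int]) -> List[Tuple[int, str]]:
    """Return (pid, container_id) for container processes that are not
    descendants of the running NodeManager (assumed heuristic).
    With nodemanager_pid=None, every container process is reported."""
    procs = parse_ps(ps_output)
    orphans = []
    for pid, (_ppid, args) in sorted(procs.items()):
        match = CONTAINER_ID.search(args)
        if match is None:
            continue
        if nodemanager_pid is None or not descends_from(pid, nodemanager_pid, procs):
            orphans.append((pid, match.group(0)))
    return orphans


# Hypothetical process table: PID 500 is the restarted NodeManager,
# PIDs 300/301 were left behind by the killed one.
SAMPLE = """  PID  PPID COMMAND
    1     0 /sbin/init
  500     1 java NodeManager
  600   500 bash /tmp/nm/container_1446663000000_0001_02_000001/launch_container.sh
  601   600 java -Did=container_1446663000000_0001_02_000001 Worker
  300     1 bash /tmp/nm/container_1446663000000_0001_01_000002/launch_container.sh
  301   300 java -Did=container_1446663000000_0001_01_000002 Worker
"""

for pid, cid in find_orphans(SAMPLE, 500):
    print(pid, cid)
```

On a real node the input could come from `ps -eo pid,ppid,args`, with the NodeManager PID looked up separately. Passing None as the NodeManager PID lists every container process, which is the state after step 2, when no NodeManager is running.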


