Posted to yarn-issues@hadoop.apache.org by "Eric Badger (JIRA)" <ji...@apache.org> on 2017/10/26 20:56:00 UTC

[jira] [Issue Comment Deleted] (YARN-7395) NM fails to successfully kill tasks that run over their memory limit

     [ https://issues.apache.org/jira/browse/YARN-7395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated YARN-7395:
------------------------------
    Comment: was deleted

(was: {noformat}
Oct 26 15:49:46 fsta100n11.tan.ygrid.yahoo.com dockerd-current: time="2017-10-26T15:49:46.287169432Z" level=error msg="Handler for POST /v1.24/containers/%27container_e127_1508997850588_0001_02_000001%27/stop?t=10 returned error: No such container: 'container_e127_1508997850588_0001_02_000001'"
Oct 26 15:49:46 fsta100n11.tan.ygrid.yahoo.com dockerd-current: time="2017-10-26T15:49:46.287193005Z" level=error msg="Handler for POST /v1.24/containers/'container_e127_1508997850588_0001_02_000001'/stop returned error: No such container: 'container_e127_1508997850588_0001_02_000001'"
{noformat}
Update: The docker stop command appears to be failing because the single quotes ({{'}}) are being included as part of the container name. Docker cannot find a container with that name, which is why the command fails with exit code 1. I'm not sure whether this will also be a problem in branch-2/trunk, given the refactoring that came with YARN-6623; our internal branch has not yet pulled back YARN-6623. )
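
A minimal Python sketch of how quotes can end up inside the container name, for illustration only. It assumes a hypothetical command string {{docker stop '<name>'}} and is not the NM's actual launch code; the {{stop_path}} helper is also hypothetical and only mirrors the request path seen in the dockerd log above. When a shell parses the command, the quotes are stripped before docker sees the name. When the command is split on whitespace and executed without a shell, the quotes stay in the argument and show up percent-encoded as %27 in the request.

```python
import shlex
from urllib.parse import quote

# Container name taken from the dockerd log above.
container = "container_e127_1508997850588_0001_02_000001"

# Hypothetical command string with the name wrapped in single quotes.
command = f"docker stop '{container}'"

# Shell-style parsing removes the quotes before the name reaches docker.
shell_argv = shlex.split(command)

# A plain whitespace split (assumed stand-in for exec without a shell)
# leaves the quotes in place as part of the argument.
raw_argv = command.split()


def stop_path(name: str) -> str:
    """Hypothetical helper: build the stop request path for a container name."""
    # quote() percent-encodes the single quote character as %27.
    return f"/v1.24/containers/{quote(name, safe='')}/stop"


print(stop_path(shell_argv[2]))  # name without quotes
print(stop_path(raw_argv[2]))    # name with literal quotes, encoded as %27
```

The second path matches the shape of the failing request in the log, which is consistent with the quotes having been passed through literally.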

> NM fails to successfully kill tasks that run over their memory limit
> --------------------------------------------------------------------
>
>                 Key: YARN-7395
>                 URL: https://issues.apache.org/jira/browse/YARN-7395
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>            Reporter: Eric Badger
>
> The NM correctly notes that the container is over its configured limit, but then fails to kill the process. As a result, the Docker container AM stays around and the job keeps running.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
