Posted to dev@myriad.apache.org by "DarinJ (JIRA)" <ji...@apache.org> on 2016/02/03 22:39:39 UTC

[jira] [Commented] (MYRIAD-153) Placeholder tasks yarn_container_* is not cleaned after yarn job is complete.

    [ https://issues.apache.org/jira/browse/MYRIAD-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131200#comment-15131200 ] 

DarinJ commented on MYRIAD-153:
-------------------------------

I've spent quite a bit of time debugging this issue and believe I have found the root cause and a solution.  The root cause is that if a YARN container goes from ALLOCATED to RELEASED, the AuxService class is never made aware of the container, so the corresponding placeholder task is never killed.  The proposed solution is to intercept the (public) method completeContainers in the scheduler and check whether a Myriad task needs to be removed when a RELEASED event is received.  I've already added the intercept as part of my debugging setup, so I should be able to have this patched soon.
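
To illustrate the idea, here is a minimal, self-contained sketch of the intercept. It is not the actual patch: BaseScheduler, EventType, and PlaceholderTasks are hypothetical stand-ins for the real YARN scheduler, its container event type, and Myriad's task bookkeeping, and the "yarn_" task-name prefix is assumed from the issue title.

```java
import java.util.HashSet;
import java.util.Set;

public class ReleasedContainerSketch {
    // Hypothetical stand-in for the YARN container event type.
    enum EventType { FINISHED, KILL, RELEASED }

    // Hypothetical stand-in for the scheduler being intercepted.
    static class BaseScheduler {
        public void completeContainers(String containerId, EventType event) {
            // The scheduler's normal bookkeeping would run here.
        }
    }

    // Hypothetical tracker for placeholder tasks (assumed name: "yarn_" + container id).
    static class PlaceholderTasks {
        private final Set<String> running = new HashSet<>();

        void launched(String containerId) {
            running.add("yarn_" + containerId);
        }

        // Returns true if a placeholder task existed and was removed.
        boolean kill(String containerId) {
            return running.remove("yarn_" + containerId);
        }

        int count() {
            return running.size();
        }
    }

    // Intercepts completion events so that a container released before it ever
    // started (ALLOCATED to RELEASED) still has its placeholder task removed.
    static class InterceptingScheduler extends BaseScheduler {
        private final PlaceholderTasks tasks;

        InterceptingScheduler(PlaceholderTasks tasks) {
            this.tasks = tasks;
        }

        @Override
        public void completeContainers(String containerId, EventType event) {
            if (event == EventType.RELEASED) {
                tasks.kill(containerId);
            }
            super.completeContainers(containerId, event);
        }
    }

    public static void main(String[] args) {
        PlaceholderTasks tasks = new PlaceholderTasks();
        InterceptingScheduler scheduler = new InterceptingScheduler(tasks);

        tasks.launched("container_0001");
        scheduler.completeContainers("container_0001", EventType.RELEASED);

        System.out.println("remaining placeholder tasks: " + tasks.count());
    }
}
```

In this sketch only RELEASED events trigger the cleanup; other completion events are assumed to be handled by the existing AuxService path.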

> Placeholder tasks yarn_container_* is not cleaned after yarn job is complete.
> -----------------------------------------------------------------------------
>
>                 Key: MYRIAD-153
>                 URL: https://issues.apache.org/jira/browse/MYRIAD-153
>             Project: Myriad
>          Issue Type: Bug
>            Reporter: Sarjeet Singh
>            Assignee: DarinJ
>             Fix For: Myriad 0.2.0
>
>         Attachments: Mesos_UI_screeshot_placeholder_tasks_running.png
>
>
> Observed that the placeholder tasks for containers launched on FGS are still in the RUNNING state on Mesos. These container tasks are not cleaned up properly after the job has finished.
> See the attached screenshot of the Mesos UI with placeholder tasks still running.


