Posted to yarn-dev@hadoop.apache.org by "Arun Suresh (JIRA)" <ji...@apache.org> on 2017/08/23 18:28:00 UTC

[jira] [Created] (YARN-7086) Release all containers asynchronously

Arun Suresh created YARN-7086:
---------------------------------

             Summary: Release all containers asynchronously
                 Key: YARN-7086
                 URL: https://issues.apache.org/jira/browse/YARN-7086
             Project: Hadoop YARN
          Issue Type: Bug
          Components: resourcemanager
            Reporter: Arun Suresh
            Assignee: Arun Suresh


We have noticed two situations in production that can cause deadlocks and bring the scheduling of new containers to a halt, especially for applications that have a lot of live containers:
# When these applications release their containers in bulk.
# When these applications terminate abruptly due to some failure, and the scheduler releases all of their live containers in a loop.

To handle the issues mentioned above, we have a patch in production to make sure ALL container releases happen asynchronously - and it has served us well.
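
For discussion, here is a minimal, self-contained sketch of the general idea. It is NOT the production patch and uses no YARN classes: the class AsyncReleaseSketch, its methods, and the plain queue of container IDs are all hypothetical stand-ins. The point it illustrates is that the caller only enqueues release requests and returns, while a single dispatcher thread takes the scheduler lock once per container instead of holding it across the whole loop.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration only; none of these names exist in YARN. */
public class AsyncReleaseSketch {
    private final Object schedulerLock = new Object();
    private final Set<String> liveContainers = new HashSet<>();
    // Pending release requests, identified by container ID.
    private final BlockingQueue<String> releaseQueue = new LinkedBlockingQueue<>();
    private volatile boolean running = true;
    private final Thread dispatcher;

    public AsyncReleaseSketch() {
        dispatcher = new Thread(this::drainReleases, "release-dispatcher");
        dispatcher.setDaemon(true);
        dispatcher.start();
    }

    /** Registers a container as live (stands in for an allocation). */
    public void allocate(String containerId) {
        synchronized (schedulerLock) {
            liveContainers.add(containerId);
        }
    }

    /** Enqueues the releases and returns without taking the scheduler lock. */
    public void asyncReleaseAll(List<String> containerIds) {
        releaseQueue.addAll(containerIds);
    }

    /** Dispatcher loop: the lock is held for one container at a time. */
    private void drainReleases() {
        try {
            while (running || !releaseQueue.isEmpty()) {
                String id = releaseQueue.poll(50, TimeUnit.MILLISECONDS);
                if (id == null) {
                    continue; // nothing queued yet; re-check the loop condition
                }
                synchronized (schedulerLock) {
                    liveContainers.remove(id);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Stops the dispatcher after it has drained everything already queued. */
    public void stop() {
        running = false;
        try {
            dispatcher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public int liveCount() {
        synchronized (schedulerLock) {
            return liveContainers.size();
        }
    }

    public static void main(String[] args) {
        AsyncReleaseSketch scheduler = new AsyncReleaseSketch();
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            String id = "container_" + i;
            scheduler.allocate(id);
            ids.add(id);
        }
        scheduler.asyncReleaseAll(ids); // returns immediately
        scheduler.stop();               // waits for the queue to drain
        System.out.println("live containers after drain: " + scheduler.liveCount());
    }
}
```

Because the lock is released between containers, other scheduler work (for example, new allocations) can interleave with a bulk release in this sketch. Whether the same holds for the real scheduler depends on its actual locking, which is not modelled here.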

Opening this JIRA to gather feedback on whether this is a good idea generally (cc [~leftnoteasy], [~jlowe], [~curino], [~kasha], [~subru], [~roniburd])

BTW, in YARN-6251 we already have an asyncReleaseContainer() in the AbstractYarnScheduler and a corresponding scheduler event, which is currently used specifically for the container-update code paths (where the scheduler releases the temporary containers that it creates for the update).
