Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/11/12 16:42:10 UTC

[jira] [Commented] (FLINK-3003) Add container allocation timeout to YARN CLI

    [ https://issues.apache.org/jira/browse/FLINK-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15002258#comment-15002258 ] 

ASF GitHub Bot commented on FLINK-3003:
---------------------------------------

Github user chiwanpark commented on the pull request:

    https://github.com/apache/flink/pull/1350#issuecomment-156142663
  
    Hi @HilmiYildirim, thanks for opening the pull request. I would like to shepherd this PR. Could you rename this PR to include the JIRA issue number? I think "[FLINK-3003] Implement a parallel version of the Hidden Markov Model" would be better.


> Add container allocation timeout to YARN CLI
> --------------------------------------------
>
>                 Key: FLINK-3003
>                 URL: https://issues.apache.org/jira/browse/FLINK-3003
>             Project: Flink
>          Issue Type: Improvement
>          Components: YARN Client
>    Affects Versions: 0.10
>            Reporter: Ufuk Celebi
>             Fix For: 1.0, 0.10.1
>
>
> Programs submitted via {{bin/flink run -m yarn-cluster}} start a short-lived YARN session before submitting the job. The job is only submitted once all resources have been allocated. Until then, all allocated containers are "blocked" by the to-be-submitted job and the cluster is only partially allocated.
> If you have multiple submissions like this with partial allocations, you can block the whole YARN cluster (e.g. 10 containers in total, two sessions that want 6 containers each, and both have allocated 5).
> A simple workaround for these situations is to add an allocation timeout after which the YARN session fails and releases all its resources.
> [Other strategies are also possible, such as waiting X amount of time for Y containers and then going with what you have if you don't get them all.]
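
The blocking scenario and the timeout workaround described in the issue can be sketched with a small model. This is only an illustration, not Flink or YARN code: the function names are hypothetical, and the round-robin hand-out of containers is an assumption made to reproduce the "both have allocated 5" example (a real YARN scheduler may behave differently).

```python
def allocate_round_robin(total, wanted):
    """Hand out containers one at a time, in turn, until none are free
    or every session has what it asked for (assumed scheduling policy)."""
    held = [0] * len(wanted)
    free = total
    progress = True
    while free > 0 and progress:
        progress = False
        for i, want in enumerate(wanted):
            if free > 0 and held[i] < want:
                held[i] += 1
                free -= 1
                progress = True
    return held


def is_blocked(total, wanted, held):
    """True if no container is free and no session is fully allocated."""
    no_free = sum(held) == total
    none_complete = all(h < w for h, w in zip(held, wanted))
    return no_free and none_complete


def release_timed_out(wanted, held, waited, timeout):
    """Fail every incomplete session whose wait reached the timeout.
    A failed session releases all of its containers (held becomes 0)."""
    return [0 if h < w and t >= timeout else h
            for h, w, t in zip(held, wanted, waited)]


# The example from the issue: 10 containers, two sessions wanting 6 each.
wanted = [6, 6]
held = allocate_round_robin(10, wanted)      # [5, 5]
blocked = is_blocked(10, wanted, held)       # True: nobody can proceed

# Hypothetical waits of 30 s and 10 s with a 20 s allocation timeout.
after = release_timed_out(wanted, held, waited=[30, 10], timeout=20)  # [0, 5]
```

After the first session times out and releases its 5 containers, the second session can obtain its sixth container and submit its job. Note that if both sessions had waited equally long, both would fail at once, which is one reason the bracketed alternative strategies might be worth considering.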



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)