Posted to issues@spark.apache.org by "Sandy Ryza (JIRA)" <ji...@apache.org> on 2014/11/04 19:00:44 UTC

[jira] [Comment Edited] (SPARK-4214) With dynamic allocation, avoid outstanding requests for more executors than pending tasks need

    [ https://issues.apache.org/jira/browse/SPARK-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196464#comment-14196464 ] 

Sandy Ryza edited comment on SPARK-4214 at 11/4/14 6:00 PM:
------------------------------------------------------------

We can implement this in either a "weak" way or a "strong" way.
* Weak: whenever we would request additional executors, avoid requesting more than pending tasks need.
* Strong: In addition to the above, when pending tasks go below the capacity of outstanding requests, cancel outstanding requests.
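The two policies above can be sketched roughly as follows. This is a hypothetical illustration, assuming a fixed number of task slots per executor; the names (executorsNeeded, tasksPerExecutor, etc.) are illustrative, not Spark's actual API:

```scala
// Hypothetical sketch of the weak/strong request policies.
// Assumes each executor runs a fixed number of tasks concurrently.
object ExecutorRequestPolicy {
  // Executors needed to cover all pending tasks (ceiling division).
  def executorsNeeded(pendingTasks: Int, tasksPerExecutor: Int): Int =
    (pendingTasks + tasksPerExecutor - 1) / tasksPerExecutor

  // Weak policy: when requesting more executors, never exceed what
  // pending tasks need, given what we have already asked for.
  def additionalToRequest(pendingTasks: Int, tasksPerExecutor: Int,
                          outstandingRequests: Int): Int =
    math.max(0, executorsNeeded(pendingTasks, tasksPerExecutor) - outstandingRequests)

  // Strong policy: when pending tasks drop below the capacity of the
  // outstanding requests, cancel the surplus requests.
  def requestsToCancel(pendingTasks: Int, tasksPerExecutor: Int,
                       outstandingRequests: Int): Int =
    math.max(0, outstandingRequests - executorsNeeded(pendingTasks, tasksPerExecutor))
}
```

So with 10 pending tasks, 4 slots per executor, and 1 outstanding request, the weak policy would ask for 2 more; if pending tasks then drop to 2 while 5 requests are still outstanding, the strong policy would cancel 4 of them.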

My opinion is that the strong way is worthwhile.  It allows us to behave well in scenarios like:
* Due to contention on the cluster, an app can get no more than X executors from YARN
* As a long job runs, we make lots of executor requests to YARN.  At some point, YARN stops fulfilling additional requests.
* The long job completes.

With cancellation, we can avoid
* Surges of unneeded executors when contention on the cluster goes away
* New executors popping back up as we kill our own for being idle

[~pwendell] [~andrewor] do you have any thoughts?



> With dynamic allocation, avoid outstanding requests for more executors than pending tasks need
> ----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-4214
>                 URL: https://issues.apache.org/jira/browse/SPARK-4214
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, YARN
>    Affects Versions: 1.2.0
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>
> Dynamic allocation tries to allocate more executors while we have pending tasks remaining.  Our current policy can end up with more outstanding executor requests than needed to fulfill all the pending tasks.  Capping the executor requests to the number of cores needed to fulfill all pending tasks would make dynamic allocation behavior less sensitive to settings for maxExecutors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org