Posted to issues@spark.apache.org by "lyc (JIRA)" <ji...@apache.org> on 2017/06/02 07:36:04 UTC

[jira] [Comment Edited] (SPARK-20662) Block jobs that have greater than a configured number of tasks

    [ https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034282#comment-16034282 ] 

lyc edited comment on SPARK-20662 at 6/2/17 7:36 AM:
-----------------------------------------------------

Do you mean `mapreduce.job.running.map.limit`? That configuration is documented as `The maximum number of simultaneous map tasks per job. There is no limit if this value is 0 or negative.`

That is a limit on task concurrency: the scheduler stops launching new tasks once the job has that many tasks running, and resumes launching as running tasks finish.

This could be done in `DAGScheduler`; I'd like to give it a try if the idea is accepted.  cc [~vanzin]
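For concreteness, here is a minimal sketch of the gating idea. It is not actual `DAGScheduler` code; the gate function, the `runningTasks` counter, and the demo are illustrative assumptions, with only the "0 or negative means no limit" semantics taken from the MapReduce description above.

```scala
// Sketch only: a per-job gate mirroring mapreduce.job.running.map.limit semantics.
// Nothing here is an existing Spark API; names are hypothetical.
object RunningTaskLimit {
  // A limit of 0 or a negative value means "no limit", matching the MapReduce description.
  def canLaunchMore(runningTasks: Int, limit: Int): Boolean =
    limit <= 0 || runningTasks < limit
}

// Example: with a limit of 4, a 5th task would only be launched after one of the
// 4 running tasks finishes.
object RunningTaskLimitDemo extends App {
  assert(RunningTaskLimit.canLaunchMore(runningTasks = 3, limit = 4))
  assert(!RunningTaskLimit.canLaunchMore(runningTasks = 4, limit = 4))
  assert(RunningTaskLimit.canLaunchMore(runningTasks = 100, limit = 0)) // no limit
}
```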



> Block jobs that have greater than a configured number of tasks
> --------------------------------------------------------------
>
>                 Key: SPARK-20662
>                 URL: https://issues.apache.org/jira/browse/SPARK-20662
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 1.6.0, 2.0.0
>            Reporter: Xuefu Zhang
>
> In a shared cluster, it's desirable for an admin to block large Spark jobs. While there may not be a single metric that defines the size of a job, the number of tasks is usually a good indicator. Thus, it would be useful for the Spark scheduler to block a job whose number of tasks reaches a configured limit. By default, the limit could be infinite, to retain the existing behavior.
> MapReduce has mapreduce.job.max.map and mapreduce.job.max.reduce, which block an MR job at job submission time.
> The proposed configuration is spark.job.max.tasks, with a default value of -1 (infinite).
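As a rough illustration of the proposal quoted above, a submission-time check might look like the following sketch. Only the config name spark.job.max.tasks and its -1 ("infinite") default come from the description; the object, method, and message wording are assumptions, not existing Spark code.

```scala
// Sketch of the proposed submission-time check; purely illustrative.
object JobSizeLimit {
  // maxTasks < 0 (the proposed -1 default) means no limit; otherwise reject
  // jobs whose task count exceeds the configured value.
  def checkJobSize(jobId: Int, numTasks: Int, maxTasks: Int): Unit = {
    if (maxTasks >= 0 && numTasks > maxTasks) {
      throw new IllegalArgumentException(
        s"Job $jobId has $numTasks tasks, which exceeds spark.job.max.tasks=$maxTasks; " +
          "the job is rejected at submission time.")
    }
  }
}
```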


