Posted to issues@spark.apache.org by "Andrew Or (JIRA)" <ji...@apache.org> on 2015/07/26 07:57:05 UTC

[jira] [Closed] (SPARK-8881) Standalone mode scheduling fails because cores assignment is not atomic

     [ https://issues.apache.org/jira/browse/SPARK-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Or closed SPARK-8881.
----------------------------
    Resolution: Fixed

> Standalone mode scheduling fails because cores assignment is not atomic
> -----------------------------------------------------------------------
>
>                 Key: SPARK-8881
>                 URL: https://issues.apache.org/jira/browse/SPARK-8881
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy
>    Affects Versions: 1.4.0, 1.5.0
>            Reporter: Nishkam Ravi
>            Assignee: Nishkam Ravi
>            Priority: Critical
>             Fix For: 1.4.2, 1.5.0
>
>
> The current scheduling algorithm (in Master.scala) has two issues:
> 1. Cores are allocated one at a time instead of spark.executor.cores at a time.
> 2. When spark.cores.max / spark.executor.cores < num_workers, executors are not launched and the app hangs (a consequence of 1).
> === Edit by Andrew ===
> Here's an example from the PR. Say we have 4 workers with 16 cores each, and we set `spark.cores.max` to 48 and `spark.executor.cores` to 16. In spread-out mode the existing code allocates 1 core at a time, so we end up allocating 12 cores on each worker, and no executors can be launched because each one needs at least 16 cores. Instead, we should allocate 16 cores at a time.
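
The arithmetic in the example above can be reproduced with a small simulation. This is only a sketch of round-robin (spread-out) allocation under simplifying assumptions. It is not the code in Master.scala, and the names `allocate_cores`, `launchable_executors`, and `step` are hypothetical, chosen for illustration.

```python
def allocate_cores(free_cores, cores_max, step):
    """Hand out cores round-robin across workers, `step` cores per turn.

    free_cores: free cores on each worker (hypothetical input shape)
    cores_max:  cap on total cores for the app (models spark.cores.max)
    step:       cores granted per turn (1 models the old behaviour,
                spark.executor.cores models the proposed one)
    Returns the number of cores assigned on each worker.
    """
    n = len(free_cores)
    assigned = [0] * n
    remaining = min(cores_max, sum(free_cores))
    pos = 0
    # Continue while the budget covers one more step and some worker can take it.
    while remaining >= step and any(
        free_cores[i] - assigned[i] >= step for i in range(n)
    ):
        if free_cores[pos] - assigned[pos] >= step:
            assigned[pos] += step
            remaining -= step
        pos = (pos + 1) % n
    return assigned


def launchable_executors(assigned, cores_per_executor):
    """An executor can start only where a full executor's worth of cores was assigned."""
    return [a // cores_per_executor for a in assigned]


workers = [16, 16, 16, 16]  # 4 workers with 16 cores each

one_at_a_time = allocate_cores(workers, cores_max=48, step=1)
print(one_at_a_time, launchable_executors(one_at_a_time, 16))

executor_sized = allocate_cores(workers, cores_max=48, step=16)
print(executor_sized, launchable_executors(executor_sized, 16))
```

With a step of 1 the model assigns 12 cores to every worker and no worker can host a 16-core executor. With a step of 16 it assigns 16 cores to three workers and launches three executors.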


