Posted to issues@spark.apache.org by "Brian Wongchaowart (JIRA)" <ji...@apache.org> on 2016/03/10 15:28:40 UTC

[jira] [Created] (SPARK-13803) Standalone master does not balance cluster-mode drivers across workers

Brian Wongchaowart created SPARK-13803:
------------------------------------------

             Summary: Standalone master does not balance cluster-mode drivers across workers
                 Key: SPARK-13803
                 URL: https://issues.apache.org/jira/browse/SPARK-13803
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 1.6.1
            Reporter: Brian Wongchaowart


The Spark standalone cluster master does not balance drivers running in cluster mode across all the available workers. Instead, it assigns each submitted driver to the first available worker. The schedule() method attempts to randomly shuffle the HashSet of workers before launching drivers, but that operation has no effect: Random.shuffle rebuilds its result as a HashSet, and a Scala HashSet is an unordered data structure, so the randomized order is immediately discarded. This behavior is a regression introduced by SPARK-1706: previously, the workers were copied into an ordered list before the random shuffle was performed.
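To illustrate the mechanism (a standalone sketch, not the actual Master code): Random.shuffle builds its result back into the same collection type as its input, so shuffling a HashSet yields a HashSet whose iteration order is determined by element hashes, not by the shuffle. Copying into an ordered Seq first, as the pre-SPARK-1706 code did, preserves the randomized order. The worker names below are made up for the demonstration.

```scala
import scala.collection.mutable
import scala.util.Random

object ShuffleDemo extends App {
  // Mimics the master's mutable HashSet of registered workers
  // (worker names here are illustrative only).
  val workers = mutable.HashSet(
    "worker-1", "worker-2", "worker-3", "worker-4", "worker-5")

  // Shuffling the HashSet directly rebuilds a HashSet, so the
  // iteration order is unchanged: the shuffle is a no-op.
  val shuffledSet = Random.shuffle(workers)
  assert(shuffledSet.toList == workers.toList)

  // Copying into an ordered Seq first keeps the shuffled order,
  // which varies from call to call.
  val shuffledSeq = Random.shuffle(workers.toSeq)
  println(shuffledSeq)
}
```

Because the first shuffle always returns the workers in the same (hash-determined) order, the master repeatedly picks the same "first" worker for every submitted driver until its resources are exhausted.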

I am able to reproduce this bug in all releases of Spark from 1.4.0 to 1.6.1 using the following steps:

# Start a standalone master and two workers
# Repeatedly submit applications to the master in cluster mode (--deploy-mode cluster)

Observe that all the drivers are scheduled on only one of the two workers as long as resources are available on that worker. The expected behavior is that the master randomly assigns drivers to both workers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org