Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2015/05/16 13:54:00 UTC

[jira] [Updated] (SPARK-4325) Improve spark-ec2 cluster launch times

     [ https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-4325:
-----------------------------
    Fix Version/s:     (was: 1.3.0)

> Improve spark-ec2 cluster launch times
> --------------------------------------
>
>                 Key: SPARK-4325
>                 URL: https://issues.apache.org/jira/browse/SPARK-4325
>             Project: Spark
>          Issue Type: Umbrella
>          Components: EC2
>            Reporter: Nicholas Chammas
>            Assignee: Nicholas Chammas
>            Priority: Minor
>
> This is an umbrella task to capture several pieces of work related to significantly improving spark-ec2 cluster launch times.
> There are several optimizations we know we can make to [{{setup.sh}} | https://github.com/mesos/spark-ec2/blob/v4/setup.sh] to make cluster launches faster.
> There are also some improvements to the AMIs that will help a lot.
> Potential improvements:
> * Upgrade the Spark AMIs and pre-install tools like Ganglia on them. This will reduce or eliminate SSH wait time and Ganglia init time.
> * Replace instances of {{download; rsync to rest of cluster}} with parallel downloads on all nodes of the cluster.
> * Replace instances of 
>  {code}
> for node in $NODES; do
>   command &
>   sleep 0.3
> done
> wait{code}
>  with simpler calls to {{pssh}}.
> * Remove the [linear backoff | https://github.com/apache/spark/blob/b32734e12d5197bad26c080e529edd875604c6fb/ec2/spark_ec2.py#L665] when we wait for SSH availability now that we are already waiting for EC2 status checks to clear before testing SSH.
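
As a rough illustration of the {{pssh}} bullet above, the loop could collapse into something like the sketch below. It is not taken from setup.sh: the helper name run_on_all, the root login, and the temporary hosts file are assumptions made for the example. It relies only on pssh's -h (hosts file), -l (login user), and -i (print each host's output inline) options.

```shell
#!/usr/bin/env bash
# Sketch only: run one command on every node with a single pssh call
# instead of a background-ssh loop followed by wait.
# run_on_all is a hypothetical helper, not part of spark-ec2.

run_on_all() {
  local nodes="$1"   # whitespace-separated host names, e.g. "$NODES"
  shift              # the remaining arguments are the command to run
  local hosts_file rc
  hosts_file="$(mktemp)"
  # pssh reads one host per line from the file passed with -h.
  # $nodes is left unquoted on purpose so it splits into one host per line.
  printf '%s\n' $nodes > "$hosts_file"
  # Assumption: nodes are reached as root, as is common on these AMIs.
  pssh -h "$hosts_file" -l root -i "$@"
  rc=$?
  rm -f "$hosts_file"
  return "$rc"
}
```

With that in place, a call such as run_on_all "$NODES" uptime would stand in for the whole for/sleep/wait block. The same helper could probably cover the parallel-download bullet as well, by having each node fetch the file itself instead of the master downloading it and rsyncing it out.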


