You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Nicholas Chammas (JIRA)" <ji...@apache.org> on 2014/12/23 08:15:13 UTC

[jira] [Reopened] (SPARK-4325) Improve spark-ec2 cluster launch times

     [ https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicholas Chammas reopened SPARK-4325:
-------------------------------------

Hey [~joshrosen], though [#3195|https://github.com/apache/spark/pull/3195] relates to this JIRA issue, it does not resolve it completely. There are several other improvements described here that have not been implemented yet.

In the future, should we try to have one PR match one JIRA issue? This issue could easily be an umbrella issue spanning several sub-tasks, one of which has been taken care of by the aforementioned PR.

> Improve spark-ec2 cluster launch times
> --------------------------------------
>
>                 Key: SPARK-4325
>                 URL: https://issues.apache.org/jira/browse/SPARK-4325
>             Project: Spark
>          Issue Type: Improvement
>          Components: EC2
>            Reporter: Nicholas Chammas
>            Assignee: Nicholas Chammas
>            Priority: Minor
>             Fix For: 1.3.0
>
>
> There are several optimizations we know we can make to [{{setup.sh}} | https://github.com/mesos/spark-ec2/blob/v4/setup.sh] to make cluster launches faster.
> There are also some improvements to the AMIs that will help a lot.
> Potential improvements:
> * Upgrade the Spark AMIs and pre-install tools like Ganglia on them. This will reduce or eliminate SSH wait time and Ganglia init time.
> * Replace instances of {{download; rsync to rest of cluster}} with parallel downloads on all nodes of the cluster.
> * Replace instances of 
>  {code}
> for node in $NODES; do
>   command
>   sleep 0.3
> done
> wait{code}
>  with simpler calls to {{pssh}}.
> * Remove the [linear backoff | https://github.com/apache/spark/blob/b32734e12d5197bad26c080e529edd875604c6fb/ec2/spark_ec2.py#L665] when we wait for SSH availability now that we are already waiting for EC2 status checks to clear before testing SSH.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org