Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2014/11/11 01:13:34 UTC

[jira] [Commented] (SPARK-4325) Improve spark-ec2 cluster launch times

    [ https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205627#comment-14205627 ] 

Apache Spark commented on SPARK-4325:
-------------------------------------

User 'nchammas' has created a pull request for this issue:
https://github.com/apache/spark/pull/3195

> Improve spark-ec2 cluster launch times
> --------------------------------------
>
>                 Key: SPARK-4325
>                 URL: https://issues.apache.org/jira/browse/SPARK-4325
>             Project: Spark
>          Issue Type: Improvement
>          Components: EC2
>            Reporter: Nicholas Chammas
>            Priority: Minor
>
> There are several optimizations we know we can make to [{{setup.sh}} | https://github.com/mesos/spark-ec2/blob/v4/setup.sh] to make cluster launches faster.
> There are also improvements we can make to the AMIs themselves that will cut launch time significantly.
> Potential improvements:
> * Upgrade the Spark AMIs and pre-install tools like Ganglia on them. This will reduce or eliminate SSH wait time and Ganglia init time.
> * Replace instances of {{download; rsync to rest of cluster}} with parallel downloads on all nodes of the cluster.
> * Replace instances of
> {code}
> for node in $NODES; do
>   command &
>   sleep 0.3
> done
> wait
> {code}
>  with simpler calls to {{pssh}}.
> * Remove the [linear backoff | https://github.com/apache/spark/blob/b32734e12d5197bad26c080e529edd875604c6fb/ec2/spark_ec2.py#L665] when we wait for SSH availability now that we are already waiting for EC2 status checks to clear before testing SSH.
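The loop-to-pssh replacement described above can be sketched as follows. This is a hypothetical example, not the actual setup.sh change: the node names, hosts-file handling, and remote command are placeholders, and it assumes pssh (parallel-ssh) is installed with passwordless SSH already configured.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of swapping a serial per-node loop for one pssh call.
# NODES and the remote command are placeholders, not real setup.sh values.
NODES="node1 node2 node3"

# pssh reads its targets from a hosts file, one host per line.
HOSTS_FILE="$(mktemp)"
for node in $NODES; do
  echo "$node"
done > "$HOSTS_FILE"

# Serial pattern being replaced:
#   for node in $NODES; do ssh "$node" 'command' & sleep 0.3; done; wait
# Parallel replacement: pssh fans the command out to every host at once,
# waits for all of them, and reports per-host success/failure.
if command -v pssh >/dev/null 2>&1; then
  pssh -h "$HOSTS_FILE" -t 0 'command'
else
  echo "pssh not installed; would run: pssh -h $HOSTS_FILE -t 0 'command'"
fi
```

With -t 0 pssh disables its per-host timeout, which matches the existing behavior of waiting for every node to finish before proceeding.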



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org