Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2016/01/08 13:07:39 UTC

[jira] [Resolved] (SPARK-5552) Automated data science AMI creation and data science cluster deployment on EC2

     [ https://issues.apache.org/jira/browse/SPARK-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-5552.
------------------------------
    Resolution: Won't Fix

> Automated data science AMI creation and data science cluster deployment on EC2
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-5552
>                 URL: https://issues.apache.org/jira/browse/SPARK-5552
>             Project: Spark
>          Issue Type: New Feature
>          Components: EC2
>            Reporter: Florian Verhein
>
> Issue created RE: https://github.com/mesos/spark-ec2/pull/90#issuecomment-72597154 (please read for background)
> Goal:
> Extend the spark-ec2 scripts to provide automated data science cluster deployment on EC2, suitable for almost(?)-production use.
> Use cases:
> - A user can build their own custom data science AMIs from a CentOS minimal image by running a Packer configuration (good defaults should be provided, with some options for flexibility).
> - A user can then easily deploy a new (correctly configured) cluster using these AMIs, and do so as quickly as possible.
> Components/modules: Spark + Tachyon + HDFS (on instance storage) + Python + R + Vowpal Wabbit + any RPMs + ... + Ganglia
> The focus is on reliability and speed of deployment (rather than, e.g., supporting many versions or dev testing).
> Use Hadoop 2 so that there is the option to move to YARN later.
> My current solution is here: https://github.com/florianverhein/spark-ec2/tree/packer. It includes other fixes/improvements that were needed to get it working.
> Now that it seems to work (but has deviated a lot more from the existing code base than I was expecting), I'm wondering what to do with it...
> Keen to hear ideas if anyone is interested.