Posted to dev@spark.apache.org by Patrick Wendell <pw...@gmail.com> on 2015/08/01 03:50:31 UTC

Re: Should spark-ec2 get its own repo?

Hey All,

I've mostly kept quiet since I am not very active in maintaining this
code anymore. However, it is a bit odd that the project is
split-brained with a lot of the code being on github and some in the
Spark repo.

If the consensus is to migrate everything to github, that seems okay
with me. I would push for user continuity, though: for instance, we could
still have a "shim" ec2/spark-ec2 script that perhaps just downloads
and unpacks the real script from github.
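
As a rough illustration of the shim idea above, the sketch below pulls a single file out of a gzipped tarball of the repository. It is only one guess at how such a shim might work, not the script that was actually written. The helper names `download_archive` and `extract_member` are hypothetical, and the archive URL follows GitHub's usual `archive/<ref>.tar.gz` pattern with `master` as a placeholder ref.

```python
import io
import tarfile
import urllib.request

# Placeholder: GitHub serves a tarball of any ref at /archive/<ref>.tar.gz.
# The repository and ref used here are assumptions for illustration only.
ARCHIVE_URL = "https://github.com/amplab/spark-ec2/archive/master.tar.gz"


def download_archive(url: str = ARCHIVE_URL) -> bytes:
    """Fetch the tarball and return its raw bytes (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def extract_member(archive_bytes: bytes, filename: str) -> bytes:
    """Return the contents of the first regular file named `filename`."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        for member in tar.getmembers():
            # Match on the base name so the top-level archive directory
            # (whatever it is called) does not matter.
            if member.isfile() and member.name.split("/")[-1] == filename:
                return tar.extractfile(member).read()
    raise FileNotFoundError(f"{filename} not found in archive")
```

A shim built this way could write the returned bytes to a local file and then run it with the user's original arguments, so existing ec2/spark-ec2 invocations would keep working.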

- Patrick

On Fri, Jul 31, 2015 at 2:13 PM, Shivaram Venkataraman
<sh...@eecs.berkeley.edu> wrote:
> Yes - it is still in progress, but I just have not had time to get to
> this. I think getting the repo moved from mesos to amplab in the codebase by
> 1.5 should be possible.
>
> Thanks
> Shivaram
>
> On Fri, Jul 31, 2015 at 3:08 AM, Sean Owen <so...@cloudera.com> wrote:
>>
>> PS: Is this still in progress? It feels like something that would be
>> good to do before 1.5.0, if it's going to happen soon.
>>
>> On Wed, Jul 22, 2015 at 6:59 AM, Shivaram Venkataraman
>> <sh...@eecs.berkeley.edu> wrote:
>> > Yeah I'll send a note to the mesos dev list just to make sure they are
>> > informed.
>> >
>> > Shivaram
>> >
>> > On Tue, Jul 21, 2015 at 11:47 AM, Sean Owen <so...@cloudera.com> wrote:
>> >>
>> >> I agree it's worth informing Mesos devs and checking that there are no
>> >> big objections. I presume Shivaram is plugged in enough to Mesos that
>> >> there won't be any surprises there, and that the project would also
>> >> agree with moving this Spark-specific bit out. They may also want to
>> >> leave a pointer to the new location in the mesos repo, of course.
>> >>
>> >> I don't think it is something that requires a formal vote. It's not a
>> >> question of ownership -- neither Apache nor the project PMC owns the
>> >> code. I don't think it's different from retiring or removing any other
>> >> code.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On Tue, Jul 21, 2015 at 7:03 PM, Mridul Muralidharan <mr...@gmail.com>
>> >> wrote:
>> >> > If I am not wrong, since the code was hosted within mesos project
>> >> > repo, I assume (at least part of it) is owned by mesos project and so
>> >> > its PMC?
>> >> >
>> >> > - Mridul
>> >> >
>> >> > On Tue, Jul 21, 2015 at 9:22 AM, Shivaram Venkataraman
>> >> > <sh...@eecs.berkeley.edu> wrote:
>> >> >> There is technically no PMC for the spark-ec2 project (I guess we are
>> >> >> kind of establishing one right now). I haven't heard anything from the
>> >> >> Spark PMC on the dev list that might suggest a need for a vote so far.
>> >> >> I will send another round of email notification to the dev list when we
>> >> >> have a JIRA / PR that actually moves the scripts (right now the only
>> >> >> thing that changed is the location of some scripts in mesos/ to amplab/).
>> >> >>
>> >> >> Thanks
>> >> >> Shivaram
>> >> >>
>> >
>> >
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: Should spark-ec2 get its own repo?

Posted by Jeremy Freeman <fr...@gmail.com>.
Hi all, definitely a +1 to this plan.

Wanted to also share this library for Spark + GCE by a collaborator of mine, Michael Broxton, which seems to expand and improve on the earlier one Nick pointed us to. It’s pip installable, not yet on spark-packages, but I’m sure he’d be game to add it.

https://github.com/broxtronix/spark-gce

> On Aug 3, 2015, at 1:25 PM, Shivaram Venkataraman <sh...@eecs.berkeley.edu> wrote:
> 
> I sent a note to the Mesos developers and created
> https://github.com/apache/spark/pull/7899 to change the repository
> pointer. There are 3-4 open PRs right now in the mesos/spark-ec2
> repository and I'll work on migrating them to amplab/spark-ec2 later
> today.
> 
> My thought on moving the Python script is that we should have a
> wrapper shell script that just fetches the latest version of
> spark_ec2.py for the corresponding Spark branch. We already have
> separate branches in our spark-ec2 repository for different Spark
> versions so it can just be a call to `wget
> https://github.com/amplab/spark-ec2/tree/<spark-version>/driver/spark_ec2.py`.
> 
> Thanks
> Shivaram
> 
> On Sun, Aug 2, 2015 at 11:34 AM, Nicholas Chammas
> <ni...@gmail.com> wrote:
>> On Sat, Aug 1, 2015 at 1:09 PM Matt Goodman <me...@gmail.com> wrote:
>>> 
>>> I am considering porting some of this to a more general spark-cloud
>>> launcher, including google/aliyun/rackspace.  It shouldn't be hard at all
>>> given the current approach for setup/install.
>> 
>> 
>> FWIW, there are already some tools for launching Spark clusters on GCE and
>> Azure:
>> 
>> http://spark-packages.org/?q=tags%3A%22Deployment%22
>> 
>> Nick
>> 
> 
> 


Re: Should spark-ec2 get its own repo?

Posted by Shivaram Venkataraman <sh...@eecs.berkeley.edu>.
I sent a note to the Mesos developers and created
https://github.com/apache/spark/pull/7899 to change the repository
pointer. There are 3-4 open PRs right now in the mesos/spark-ec2
repository and I'll work on migrating them to amplab/spark-ec2 later
today.

My thought on moving the Python script is that we should have a
wrapper shell script that just fetches the latest version of
spark_ec2.py for the corresponding Spark branch. We already have
separate branches in our spark-ec2 repository for different Spark
versions so it can just be a call to `wget
https://github.com/amplab/spark-ec2/tree/<spark-version>/driver/spark_ec2.py`.
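
One detail worth noting about the `wget` call above: a github.com `/tree/...` URL serves an HTML page rather than the file itself, so a wrapper would more likely need the raw-content host. The sketch below builds such a URL for a given Spark version. The `branch-<major>.<minor>` naming and the helper names are assumptions made purely for illustration; the thread only says that separate branches exist, and the `driver/spark_ec2.py` path is taken from the message as written.

```python
import re
import urllib.request

# raw.githubusercontent.com serves file contents directly, using the
# layout <owner>/<repo>/<ref>/<path>.
RAW_HOST = "https://raw.githubusercontent.com"
REPO = "amplab/spark-ec2"             # repository named in the thread
SCRIPT_PATH = "driver/spark_ec2.py"   # path as quoted in the thread


def branch_for(spark_version: str) -> str:
    """Map '1.5.0' to 'branch-1.5' (hypothetical branch naming scheme)."""
    match = re.match(r"^(\d+)\.(\d+)", spark_version)
    if match is None:
        raise ValueError(f"unrecognised Spark version: {spark_version!r}")
    return f"branch-{match.group(1)}.{match.group(2)}"


def raw_script_url(spark_version: str) -> str:
    """Build the raw-content URL of the script for this Spark version."""
    return f"{RAW_HOST}/{REPO}/{branch_for(spark_version)}/{SCRIPT_PATH}"


def fetch_script(spark_version: str, dest: str = "spark_ec2.py") -> str:
    """Download the script to `dest` (needs network access)."""
    urllib.request.urlretrieve(raw_script_url(spark_version), dest)
    return dest
```

Keeping the version-to-branch mapping in one small function means the wrapper in the Spark repo would only need to know its own version, while the launcher script itself stays in the separate repository.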

Thanks
Shivaram

On Sun, Aug 2, 2015 at 11:34 AM, Nicholas Chammas
<ni...@gmail.com> wrote:
> On Sat, Aug 1, 2015 at 1:09 PM Matt Goodman <me...@gmail.com> wrote:
>>
>> I am considering porting some of this to a more general spark-cloud
>> launcher, including google/aliyun/rackspace.  It shouldn't be hard at all
>> given the current approach for setup/install.
>
>
> FWIW, there are already some tools for launching Spark clusters on GCE and
> Azure:
>
> http://spark-packages.org/?q=tags%3A%22Deployment%22
>
> Nick
>



Re: Should spark-ec2 get its own repo?

Posted by Nicholas Chammas <ni...@gmail.com>.
On Sat, Aug 1, 2015 at 1:09 PM Matt Goodman <me...@gmail.com> wrote:

> I am considering porting some of this to a more general spark-cloud
> launcher, including google/aliyun/rackspace.  It shouldn't be hard at all
> given the current approach for setup/install.
>

FWIW, there are already some tools for launching Spark clusters on GCE and
Azure:

http://spark-packages.org/?q=tags%3A%22Deployment%22

Nick

Re: Should spark-ec2 get its own repo?

Posted by Josh Rosen <jo...@databricks.com>.
I don't think that using git submodules is a good idea here:

   - The extra `git submodule init && git submodule update` step can lead
   to confusing problems in certain workflows.
   - We'd wind up with many commits that serve only to bump the submodule
   SHA; these commits will be hard to review since they won't contain line
   diffs (the author will have to manually provide a link to the diff of code
   changes).


On Sat, Aug 1, 2015 at 10:08 AM, Matt Goodman <me...@gmail.com> wrote:

> I think that is a good idea, and slated to happen.  At the very least a
> README or some such.  Is this a use case for git submodules?  I am
> considering porting some of this to a more general spark-cloud launcher,
> including google/aliyun/rackspace.  It shouldn't be hard at all given the
> current approach for setup/install.
>
> --Matthew Goodman
>
> =====================
> Check Out My Website: http://craneium.net
> Find me on LinkedIn: http://tinyurl.com/d6wlch
>

Re: Should spark-ec2 get its own repo?

Posted by Matt Goodman <me...@gmail.com>.
I think that is a good idea, and slated to happen.  At the very least a
README or some such.  Is this a use case for git submodules?  I am
considering porting some of this to a more general spark-cloud launcher,
including google/aliyun/rackspace.  It shouldn't be hard at all given the
current approach for setup/install.

--Matthew Goodman

=====================
Check Out My Website: http://craneium.net
Find me on LinkedIn: http://tinyurl.com/d6wlch

On Fri, Jul 31, 2015 at 6:50 PM, Patrick Wendell <pw...@gmail.com> wrote:

> Hey All,
>
> I've mostly kept quiet since I am not very active in maintaining this
> code anymore. However, it is a bit odd that the project is
> split-brained with a lot of the code being on github and some in the
> Spark repo.
>
> If the consensus is to migrate everything to github, that seems okay
> with me. I would push for user continuity, though: for instance, we could
> still have a "shim" ec2/spark-ec2 script that perhaps just downloads
> and unpacks the real script from github.
>
> - Patrick
>