Posted to user@spark.apache.org by Jeremy Lee <un...@gmail.com> on 2014/05/30 14:08:15 UTC

Yay for 1.0.0! EC2 Still has problems.

Hi there! I'm relatively new to the list, so sorry if this is a repeat:

I just wanted to mention there are still problems with the EC2 scripts.
Basically, they don't work.

First, if you run the scripts on Amazon's own suggested version of Linux,
they break, because Amazon installs Python 2.6.9 and the scripts use a
couple of Python 2.7 commands. I have to "sudo yum install python27", and
then edit the spark-ec2 shell script to use that specific version.
Annoying, but minor.

(the base "python" command isn't upgraded to 2.7 on many systems,
apparently because it would break yum)
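
For the record, the edit is tiny: the wrapper just needs to invoke the 2.7
binary instead of the system "python". Something like this at the bottom of
the spark-ec2 shell script (paraphrasing from memory, so the exact original
line may differ):

    # invoke the driver with the explicitly-installed 2.7
    # instead of the system python (2.6.9)
    python2.7 ./spark_ec2.py "$@"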

The second minor problem is that the script doesn't know about the
"r3.large" servers... also easily fixed by adding them to the spark_ec2.py
script. Minor.

The big problem is that after the EC2 cluster is provisioned, installed,
set up, and everything, it fails to start up the webserver on the master.
Here's the tail of the log:

Starting GANGLIA gmond:                                    [  OK  ]
Shutting down GANGLIA gmond:                               [FAILED]
Starting GANGLIA gmond:                                    [  OK  ]
Connection to ec2-54-183-82-48.us-west-1.compute.amazonaws.com closed.
Shutting down GANGLIA gmond:                               [FAILED]
Starting GANGLIA gmond:                                    [  OK  ]
Connection to ec2-54-183-82-24.us-west-1.compute.amazonaws.com closed.
Shutting down GANGLIA gmetad:                              [FAILED]
Starting GANGLIA gmetad:                                   [  OK  ]
Stopping httpd:                                            [FAILED]
Starting httpd: httpd: Syntax error on line 153 of
/etc/httpd/conf/httpd.conf: Cannot load modules/mod_authn_alias.so into
server: /etc/httpd/modules/mod_authn_alias.so: cannot open shared object
file: No such file or directory
                                                           [FAILED]

Basically, the AMI you have chosen does not seem to have a "full" install
of Apache, and is missing several modules that are referred to in the
httpd.conf file that is installed. The full list of missing modules is:

authn_alias_module modules/mod_authn_alias.so
authn_default_module modules/mod_authn_default.so
authz_default_module modules/mod_authz_default.so
ldap_module modules/mod_ldap.so
authnz_ldap_module modules/mod_authnz_ldap.so
disk_cache_module modules/mod_disk_cache.so

Alas, even if these modules are commented out, the server still fails to
start.

[root@ip-172-31-11-193 ~]$ service httpd start
Starting httpd: AH00534: httpd: Configuration error: No MPM loaded.

That means Spark 1.0.0 clusters on EC2 are Dead-On-Arrival when run
according to the instructions. Sorry.

Any suggestions on how to proceed? I'll keep trying to fix the webserver,
but (a) changes to httpd.conf get blown away by "resume", and (b) anything
I do has to be redone every time I provision another cluster. Ugh.
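
One lead, for anyone searching the archives later: AH00534 on Apache 2.4
means no MPM module is loaded, so a LoadModule line along these lines in
httpd.conf (assuming the module file actually exists on this AMI, which I
haven't verified) might be the missing piece:

    # httpd 2.4 requires exactly one MPM to be loaded explicitly
    LoadModule mpm_prefork_module modules/mod_mpm_prefork.so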

-- 
Jeremy Lee  BCompSci(Hons)
  The Unorthodox Engineers

Re: Yay for 1.0.0! EC2 Still has problems.

Posted by Mayur Rustagi <ma...@gmail.com>.
We are migrating our scripts to r3. The lineage is in spark-ec2; we would
be happy to migrate those too.
Having trouble with the ganglia setup currently :)
Regards
Mayur
On 31 May 2014 09:07, "Patrick Wendell" <pw...@gmail.com> wrote:

> Hi Jeremy,
>
> That's interesting, I don't think anyone has ever reported an issue
> running these scripts due to Python incompatibility, but they may require
> Python 2.7+. I regularly run them from the AWS Ubuntu 12.04 AMI... that
> might be a good place to start. But if there is a straightforward way to
> make them compatible with 2.6 we should do that.
>
> For r3.large, we can add that to the script. It's a newer type. Any
> interest in contributing this?
>
> - Patrick

Re: Yay for 1.0.0! EC2 Still has problems.

Posted by Nicholas Chammas <ni...@gmail.com>.
Ah yes, looking back at the first email in the thread, indeed that was the
case. For the record, I too launch clusters from my laptop, where I have
Python 2.7 installed.


On Sun, Jun 1, 2014 at 2:01 PM, Patrick Wendell <pw...@gmail.com> wrote:

> Hi Jeremy,
>
> Hey just to clarify this - my understanding is that the poster
> (Jeremy) was using a custom AMI to *launch* spark-ec2. I normally
> launch spark-ec2 from my laptop. And he was looking for an AMI that
> had a high enough version of Python.
>
> Spark-ec2 itself has a flag "-a" that allows you to give a specific
> AMI. This flag is just an internal tool that we use for testing when
> we spin new AMIs. Users can't set that to an arbitrary AMI because we
> tightly control things like the Java and OS versions, libraries, etc.

Re: Yay for 1.0.0! EC2 Still has problems.

Posted by Nicholas Chammas <ni...@gmail.com>.
On Wed, Jun 4, 2014 at 9:35 AM, Jeremy Lee <un...@gmail.com>
wrote:

> Oh, I went back to m1.large while those issues get sorted out.


Random side note: Amazon is deprecating the m1 instances in favor of m3
instances, which have SSDs and more ECUs than their m1 counterparts. The
m3.2xlarge has 30GB of RAM and may be a good-enough substitute for the r3
instances for the time being.

Nick

Re: Yay for 1.0.0! EC2 Still has problems.

Posted by Jeremy Lee <un...@gmail.com>.
On Wed, Jun 4, 2014 at 12:31 PM, Matei Zaharia <ma...@gmail.com>
wrote:

> Ah, sorry to hear you had more problems. Some thoughts on them:
>

There will always be more problems, 'tis the nature of coding. :-) I try
not to bother the list until I've smacked my head against them for a few
hours, so it's only the "most confusing" stuff I pour out here. I'm
actually progressing pretty well.


> Another issue I seem to have found (now that I can get small clusters
> up): some of the examples (the streaming.Twitter ones especially) depend
> on there being a "/mnt/spark" and "/mnt2/spark" directory (I think for
> java tempfiles?) and those don't seem to exist out-of-the-box.
>
> I think this is a side-effect of the r3 instances not having those drives
> mounted. Our setup script would normally create these directories. What was
> the error?
>
>
Oh, I went back to m1.large while those issues get sorted out. I decided I
had enough problems without messing with that too. (seriously, why does
Amazon do these things? It's like they _try_ to make the instances
incompatible.)

I forget the exact error, but it traced through createTempFile and it was
fairly clear about the directory being missing. Things like
"bin/run-example SparkPi" worked fine, but I'll bet twitter4j creates temp
files, so "bin/run-example streaming.TwitterPopularTags" broke.
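
If I had to guess, those paths come from Spark's scratch-space setting on
the EC2 image; something like this in conf/spark-env.sh (an assumption on
my part, I haven't confirmed where the cluster actually sets it):

    # scratch space for shuffle and temp files on the ephemeral drives
    export SPARK_LOCAL_DIRS="/mnt/spark,/mnt2/spark"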

> What did you change log4j.properties to? It should be changed to say
> log4j.rootCategory=WARN, console but maybe another log4j.properties is
> somehow arriving on the classpath. This is definitely a common problem so
> we need to add some explicit docs on it.
>

I seem to have this sorted out, don't ask me how. Once again I was probably
editing things on the cluster master when I should have been editing the
cluster controller, or vice versa. But, yeah, many of the examples just get
lost in a sea of DAG INFO messages.


> Are you going through http://spark.apache.org/docs/latest/quick-start.html?
> You should be able to do just sbt package. Once you do that you don’t need
> to deploy your application’s JAR to the cluster, just pass it to
> spark-submit and it will automatically be sent over.
>

Ah, that answers another question I just asked elsewhere... Yup, I re-read
pretty much every documentation page daily. And I'm making my way through
every video.


> > Meanwhile I'm learning scala... Great Turing's Ghost, it's the dream
> language we've theorized about for years! I hadn't realized!
>
> Indeed, glad you’re enjoying it.
>

"Enjoying", not yet alas, I'm sure I'll get there. But I do understand the
implications of a mixed functional-imperative language with closures and
lambdas. That is serious voodoo.

-- 
Jeremy Lee  BCompSci(Hons)
  The Unorthodox Engineers

Re: Yay for 1.0.0! EC2 Still has problems.

Posted by Matei Zaharia <ma...@gmail.com>.
Ah, sorry to hear you had more problems. Some thoughts on them:

> Thanks for that, Matei! I'll look at that once I get a spare moment. :-)
> 
> If you like, I'll keep documenting my newbie problems and frustrations... perhaps it might make things easier for others.
> 
> Another issue I seem to have found (now that I can get small clusters up): some of the examples (the streaming.Twitter ones especially) depend on there being a "/mnt/spark" and "/mnt2/spark" directory (I think for java tempfiles?) and those don't seem to exist out-of-the-box. I have to create those directories and use "copy-dir" to get them to the workers before those examples run.

I think this is a side-effect of the r3 instances not having those drives mounted. Our setup script would normally create these directories. What was the error?

> Much of the the last two days for me have been about failing to get any of my own code to work, except for in spark-shell. (which is very nice, btw)
> 
> At first I tried editing the examples, because I took the documentation literally when it said "Finally, Spark includes several samples in the examples directory (Scala, Java, Python). You can run them as follows:"  but of course didn't realize editing them is pointless because while the source is there, the code is actually pulled from a .jar elsewhere. Doh. (so obvious in hindsight)
> 
> I couldn't even turn down the voluminous INFO messages to WARNs, no matter how many conf/log4j.properties files I edited or copy-dir'd. I'm sure there's a trick to that I'm not getting. 

What did you change log4j.properties to? It should be changed to say log4j.rootCategory=WARN, console but maybe another log4j.properties is somehow arriving on the classpath. This is definitely a common problem so we need to add some explicit docs on it.
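
For example, a minimal conf/log4j.properties, patterned on the conf/log4j.properties.template that ships with Spark, would be:

    # Log everything to the console, but only at WARN and above
    log4j.rootCategory=WARN, console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.target=System.err
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n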

> Even trying to build SimpleApp I've run into the problem that all the documentation says to use "sbt/sbt assembly", but sbt doesn't seem to be in the 1.0.0 pre-built packages that I downloaded.

Are you going through http://spark.apache.org/docs/latest/quick-start.html? You should be able to do just sbt package. Once you do that you don’t need to deploy your application’s JAR to the cluster, just pass it to spark-submit and it will automatically be sent over.
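
Concretely, from your application directory it is just the following (the jar path depends on your project name and Scala version, so treat it as a placeholder, as is the master URL):

    sbt package
    ./bin/spark-submit \
      --class SimpleApp \
      --master spark://<master-hostname>:7077 \
      target/scala-2.10/simple-project_2.10-1.0.jar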

> Ah... yes.. there it is in the source package. I suppose that means that in order to deploy any new code to the cluster, I've got to rebuild from source on my "cluster controller". OK, I never liked that Amazon Linux AMI anyway. I'm going to start from scratch again with an Ubuntu 12.04 instance, hopefully that will be more auspicious...
> 
> Meanwhile I'm learning scala... Great Turing's Ghost, it's the dream language we've theorized about for years! I hadn't realized!

Indeed, glad you’re enjoying it.

Matei

Re: Yay for 1.0.0! EC2 Still has problems.

Posted by Jeremy Lee <un...@gmail.com>.
Thanks for that, Matei! I'll look at that once I get a spare moment. :-)

If you like, I'll keep documenting my newbie problems and frustrations...
perhaps it might make things easier for others.

Another issue I seem to have found (now that I can get small clusters up):
some of the examples (the streaming.Twitter ones especially) depend on
there being a "/mnt/spark" and "/mnt2/spark" directory (I think for java
tempfiles?) and those don't seem to exist out-of-the-box. I have to create
those directories and use "copy-dir" to get them to the workers before
those examples run.
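
Concretely, this is what I run on the master before those examples work (a
sketch; I'm assuming copy-dir is the rsync helper the EC2 scripts leave at
~/spark-ec2/copy-dir):

    # create the scratch directories, then mirror them to every worker
    mkdir -p /mnt/spark /mnt2/spark
    ~/spark-ec2/copy-dir /mnt/spark
    ~/spark-ec2/copy-dir /mnt2/spark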

Much of the last two days for me has been about failing to get any of my
own code to work, except in spark-shell. (which is very nice, btw)

At first I tried editing the examples, because I took the documentation
literally when it said "Finally, Spark includes several samples in the
examples directory (Scala, Java, Python). You can run them as follows:"
 but of course didn't realize editing them is pointless because while the
source is there, the code is actually pulled from a .jar elsewhere. Doh.
(so obvious in hindsight)

I couldn't even turn down the voluminous INFO messages to WARNs, no matter
how many conf/log4j.properties files I edited or copy-dir'd. I'm sure
there's a trick to that I'm not getting.

Even trying to build SimpleApp I've run into the problem that all the
documentation says to use "sbt/sbt assembly", but sbt doesn't seem to be in
the 1.0.0 pre-built packages that I downloaded.

Ah... yes.. there it is in the source package. I suppose that means that in
order to deploy any new code to the cluster, I've got to rebuild from
source on my "cluster controller". OK, I never liked that Amazon Linux AMI
anyway. I'm going to start from scratch again with an Ubuntu 12.04
instance, hopefully that will be more auspicious...

Meanwhile I'm learning scala... Great Turing's Ghost, it's the dream
language we've theorized about for years! I hadn't realized!



On Mon, Jun 2, 2014 at 12:05 PM, Matei Zaharia <ma...@gmail.com>
wrote:

> FYI, I opened https://issues.apache.org/jira/browse/SPARK-1990 to track
> this.
>
> Matei

-- 
Jeremy Lee  BCompSci(Hons)
  The Unorthodox Engineers

Re: Yay for 1.0.0! EC2 Still has problems.

Posted by Matei Zaharia <ma...@gmail.com>.
FYI, I opened https://issues.apache.org/jira/browse/SPARK-1990 to track this.

Matei

On Jun 1, 2014, at 6:14 PM, Jeremy Lee <un...@gmail.com> wrote:

> Sort of.. there were two separate issues, but both related to AWS..
> 
> I've sorted the confusion about the Master/Worker AMI ... use the version chosen by the scripts. (and use the right instance type so the script can choose wisely)
> 
> But yes, one also needs a "launch machine" to kick off the cluster, and for that I _also_ was using an Amazon instance... (made sense.. I have a team that will need to do things as well, not just me) and I was just pointing out that if you use the "most recommended by Amazon" AMI (for your free micro instance, for example) you get Python 2.6 and the ec2 scripts fail.
> 
> That merely needs a line in the documentation saying "use Ubuntu for your cluster controller, not Amazon Linux" or somesuch. But yeah, for a newbie, it was hard working out when to use "default" or "custom" AMIs for various parts of the setup.

Re: Yay for 1.0.0! EC2 Still has problems.

Posted by Jeremy Lee <un...@gmail.com>.
Sort of.. there were two separate issues, but both related to AWS..

I've sorted the confusion about the Master/Worker AMI ... use the version
chosen by the scripts. (and use the right instance type so the script can
choose wisely)

But yes, one also needs a "launch machine" to kick off the cluster, and for
that I _also_ was using an Amazon instance... (made sense.. I have a team
that will need to do things as well, not just me) and I was just pointing
out that if you use the "most recommended by Amazon" AMI (for your free
micro instance, for example) you get Python 2.6 and the ec2 scripts fail.

That merely needs a line in the documentation saying "use Ubuntu for your
cluster controller, not Amazon Linux" or somesuch. But yeah, for a newbie,
it was hard working out when to use "default" or "custom" AMIs for various
parts of the setup.


On Mon, Jun 2, 2014 at 4:01 AM, Patrick Wendell <pw...@gmail.com> wrote:

> Hey just to clarify this - my understanding is that the poster
> (Jeremy) was using a custom AMI to *launch* spark-ec2. I normally
> launch spark-ec2 from my laptop. And he was looking for an AMI that
> had a high enough version of Python.
>
> Spark-ec2 itself has a flag "-a" that allows you to give a specific
> AMI. This flag is just an internal tool that we use for testing when
> we spin new AMIs. Users can't set that to an arbitrary AMI because we
> tightly control things like the Java and OS versions, libraries, etc.

-- 
Jeremy Lee  BCompSci(Hons)
  The Unorthodox Engineers

Re: Yay for 1.0.0! EC2 Still has problems.

Posted by Matei Zaharia <ma...@gmail.com>.
More specifically with the -a flag, you *can* set your own AMI, but you’ll need to base it off ours. This is because spark-ec2 assumes that some packages (e.g. Java, Python 2.6) are already available on the AMI.
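
That is, something like this, where the AMI ID is a placeholder for an image derived from ours:

    ./spark-ec2 -k my-keypair -i ~/my-keypair.pem -a ami-xxxxxxxx launch my-cluster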

Matei

On Jun 1, 2014, at 11:01 AM, Patrick Wendell <pw...@gmail.com> wrote:

> Hey just to clarify this - my understanding is that the poster
> (Jeremy) was using a custom AMI to *launch* spark-ec2. I normally
> launch spark-ec2 from my laptop. And he was looking for an AMI that
> had a high enough version of Python.
> 
> Spark-ec2 itself has a flag "-a" that allows you to give a specific
> AMI. This flag is just an internal tool that we use for testing when
> we spin new AMIs. Users can't set that to an arbitrary AMI because we
> tightly control things like the Java and OS versions, libraries, etc.

Re: Yay for 1.0.0! EC2 Still has problems.

Posted by Patrick Wendell <pw...@gmail.com>.
Hey just to clarify this - my understanding is that the poster
(Jeremy) was using a custom AMI to *launch* spark-ec2. I normally
launch spark-ec2 from my laptop. And he was looking for an AMI that
had a high enough version of Python.

Spark-ec2 itself has a flag "-a" that allows you to give a specific
AMI. This flag is just an internal tool that we use for testing when
we spin new AMIs. Users can't set that to an arbitrary AMI because we
tightly control things like the Java and OS versions, libraries, etc.



Re: Yay for 1.0.0! EC2 Still has problems.

Posted by Jeremy Lee <un...@gmail.com>.
*sigh* OK, I figured it out. (Thank you Nick, for the hint)

"m1.large" works, (I swear I tested that earlier and had similar issues...
)

It was my obsession with starting "r3.*large" instances. Clearly I hadn't
patched the script in all the places.. which I think caused it to default
to the Amazon AMI. I'll have to take a closer look at the code and see if I
can't fix it correctly, because I really, really do want nodes with 2x the
CPU and 4x the memory for the same low spot price. :-)

I've got a cluster up now, at least. Time for the fun stuff...

Thanks everyone for the help!



On Sun, Jun 1, 2014 at 5:19 PM, Nicholas Chammas <nicholas.chammas@gmail.com> wrote:

> If you are explicitly specifying the AMI in your invocation of spark-ec2,
> may I suggest simply removing any explicit mention of AMI from your
> invocation? spark-ec2 automatically selects an appropriate AMI based on
> the specified instance type.

-- 
Jeremy Lee  BCompSci(Hons)
  The Unorthodox Engineers

Re: Yay for 1.0.0! EC2 Still has problems.

Posted by Nicholas Chammas <ni...@gmail.com>.
If you are explicitly specifying the AMI in your invocation of spark-ec2,
may I suggest simply removing any explicit mention of AMI from your
invocation? spark-ec2 automatically selects an appropriate AMI based on the
specified instance type.
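
For example, an invocation along these lines (key pair, identity file, and
sizes are placeholders) lets spark-ec2 pick the AMI on its own:

    ./spark-ec2 -k my-keypair -i ~/my-keypair.pem -s 2 -t m1.large launch my-cluster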


Re: Yay for 1.0.0! EC2 Still has problems.

Posted by Nicholas Chammas <ni...@gmail.com>.
Could you post how exactly you are invoking spark-ec2? And are you having
trouble just with r3 instances, or with any instance type?


Re: Yay for 1.0.0! EC2 Still has problems.

Posted by Jeremy Lee <un...@gmail.com>.
It's been another day of spinning up dead clusters...

I thought I'd finally worked out what everyone else knew - don't use the
default AMI - but I've now run through all of the "official" quick-start
linux releases and I'm none the wiser:

Amazon Linux AMI 2014.03.1 - ami-7aba833f (64-bit)
Provisions servers, connects, installs, but the webserver on the master
will not start

Red Hat Enterprise Linux 6.5 (HVM) - ami-5cdce419
Spot instance requests are not supported for this AMI.

SuSE Linux Enterprise Server 11 sp3 (HVM) - ami-1a88bb5f
Not tested - costs 10x more for spot instances, not economically viable.

Ubuntu Server 14.04 LTS (HVM) - ami-f64f77b3
Provisions servers, but "git" is not pre-installed, so the cluster setup
fails.

Amazon Linux AMI (HVM) 2014.03.1 - ami-5aba831f
Provisions servers, but "git" is not pre-installed, so the cluster setup
fails.

Have I missed something? What AMIs are people using? I've just gone back
through the archives, and I'm seeing a lot of "I can't get EC2 to work" and
not a single "My EC2 has post-install issues".

The quickstart page says "...can have a Spark cluster up and running in
five minutes." But it's been three days for me so far. I'm about to bite
the bullet and start building my own AMIs from scratch... if anyone can
save me from that, I'd be most grateful.

-- 
Jeremy Lee  BCompSci(Hons)
  The Unorthodox Engineers

Re: Yay for 1.0.0! EC2 Still has problems.

Posted by Jeremy Lee <un...@gmail.com>.
Oh, sorry, I forgot to add: here are the extra lines in my spark_ec2.py

@205
    "r3.large":    "hvm",
    "r3.xlarge":   "hvm",
    "r3.2xlarge":  "hvm",
    "r3.4xlarge":  "hvm",
    "r3.8xlarge":  "hvm"

Clearly a masterpiece of hacking. :-) I haven't tested all of them.  The r3
set seems to act like i2.



-- 
Jeremy Lee  BCompSci(Hons)
  The Unorthodox Engineers

Re: Yay for 1.0.0! EC2 Still has problems.

Posted by Jeremy Lee <un...@gmail.com>.
Hi there, Patrick. Thanks for the reply...

It wouldn't surprise me that AWS Ubuntu has Python 2.7. Ubuntu is cool like
that. :-)

Alas, the Amazon Linux AMI (2014.03.1) does not, and it's the very first
one on the recommended instance list. (Ubuntu is #4, after Amazon, RedHat,
SUSE) So, users such as myself who deliberately pick the "Most Amazon-ish
obvious first choice" find they picked the wrong one.

But that's trivial compared to the failure of the cluster to come up,
apparently due to the master's http configuration. Any help on that would
be much appreciated... it's giving me serious grief.



On Sat, May 31, 2014 at 1:37 PM, Patrick Wendell <pw...@gmail.com> wrote:

> Hi Jeremy,
>
> That's interesting, I don't think anyone has ever reported an issue
> running these scripts due to Python incompatibility, but they may require
> Python 2.7+. I regularly run them from the AWS Ubuntu 12.04 AMI... that
> might be a good place to start. But if there is a straightforward way to
> make them compatible with 2.6 we should do that.
>
> For r3.large, we can add that to the script. It's a newer type. Any
> interest in contributing this?
>
> - Patrick


-- 
Jeremy Lee  BCompSci(Hons)
  The Unorthodox Engineers

Re: Yay for 1.0.0! EC2 Still has problems.

Posted by nit <ni...@gmail.com>.
I am also running into the "modules/mod_authn_alias.so" issue on
r3.8xlarge when launching a cluster with ./spark-ec2, so Ganglia is not
accessible. From the posts it seems that Patrick suggested using Ubuntu
12.04. Can you please provide the name of an AMI that can be passed via
the -a flag (example invocation sketched after the list below) and that
does not have this issue?

- I am running the script with
"--spark-git-repo=https://github.com/apache/spark", which I assume
should deploy the latest code.

- I have been able to launch a cluster on m2.4xlarge, where Ganglia
works.

- From what I understand, we are not supposed to use just any AMI; it
would be helpful to publish a list of AMIs that people use with
different instance types.
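
For reference, this is the shape of invocation I mean, with the AMI id
purely a placeholder (not a known-good image). From what I can tell,
spark-ec2 expects an AMI laid out the way its setup scripts assume, so
pointing -a at an arbitrary stock AMI is unlikely to work:

./spark-ec2 -k my-keypair -i ~/.ssh/my-keypair.pem \
  -t r3.8xlarge -a ami-xxxxxxxx \
  --spark-git-repo=https://github.com/apache/spark \
  launch my-cluster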



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Yay-for-1-0-0-EC2-Still-has-problems-tp6578p9307.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

