Posted to user@spark.apache.org by Aureliano Buendia <bu...@gmail.com> on 2014/01/08 04:21:00 UTC

EC2 scripts documentations lacks how to actually run applications

Hi,

The EC2 documentation <http://spark.incubator.apache.org/docs/0.8.1/ec2-scripts.html> has
a section called 'Running Applications', but it actually lacks the step
which should describe how to run the application.

The spark_ec2 script <https://github.com/apache/incubator-spark/blob/59e8009b8d5e51b6f776720de8c9ecb09e1072dc/ec2/spark_ec2.py> seems
to set up a standalone cluster, although I'm not sure why AMI_PREFIX
points to a Mesos AMI list <https://github.com/apache/incubator-spark/blob/59e8009b8d5e51b6f776720de8c9ecb09e1072dc/ec2/spark_ec2.py#L44>.

Assuming that the cluster type is standalone, we could run the app with the
spark-class script. Is this the missing step in the documentation?

The spark-class script does not launch a daemon; is it supposed to be used
with nohup for long-running applications?
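
To be concrete, this is the kind of pattern I have in mind (a hypothetical
sketch only; `sleep 2` stands in for the real `./spark-class <main class>
<args>` invocation, which needs a running cluster):

```shell
# Hypothetical daemonizing pattern: detach the driver from the terminal
# with nohup, background it, and keep its pid for a later shutdown.
# "sleep 2" is a stand-in for the real long-running driver command,
# e.g. ./spark-class org.example.MyApp spark://master:7077
nohup sleep 2 > driver.log 2>&1 &
echo $! > driver.pid

# later, stop the daemonized process:
kill "$(cat driver.pid)" 2>/dev/null || true
```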

Finally, is the standalone cluster type used for real-world applications,
or do people use Spark on YARN or Mesos in production?

Re: EC2 scripts documentations lacks how to actually run applications

Posted by Aureliano Buendia <bu...@gmail.com>.
Thanks Patrick. I take it that Spark does not come with a daemonizer, and
the user is responsible for that.

Having said that, it feels odd that Spark doesn't come with an application
daemonizer out of the box. I know Spark is fast, but it's not _that_ fast
to not need a daemonizer :)


On Wed, Jan 8, 2014 at 7:57 PM, Patrick Wendell <pw...@gmail.com> wrote:

> Hey Aureliano,
>
> Yes, people run long-running applications with standalone mode and run
> them in production. spark-class is a utility script for convenience.
> If you want to run a long-running application you would write a Spark
> application, bundle it, and submit it to the cluster. You can then
> launch your own application with nohup or however else you want to
> daemonize it.
>
> Here is an example of a standalone application.
>
> http://spark.incubator.apache.org/docs/latest/quick-start.html
>
> The pull request Mark referred to adds some support for submitting
> your driver program to the cluster... but it's just an extra feature.
> Launching packaged applications is the way you want to go for your use
> case.
>
> - Patrick
>
> On Wed, Jan 8, 2014 at 10:31 AM, Mark Hamstra <ma...@clearstorydata.com>
> wrote:
> > https://github.com/apache/incubator-spark/pull/293
> >
> >
> > On Wed, Jan 8, 2014 at 10:12 AM, Aureliano Buendia <buendia360@gmail.com>
> > wrote:
> >>
> >> Here is a refactored version of the question:
> >>
> >> How do you run spark-class for long-running applications? Why doesn't
> >> spark-class launch a daemon?
> >>
> >>
> >> On Wed, Jan 8, 2014 at 3:21 AM, Aureliano Buendia <buendia360@gmail.com>
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>> The EC2 documentation has a section called 'Running Applications', but
> >>> it actually lacks the step which should describe how to run the
> >>> application.
> >>>
> >>> The spark_ec2 script seems to set up a standalone cluster, although I'm
> >>> not sure why AMI_PREFIX points to a Mesos AMI list.
> >>>
> >>> Assuming that the cluster type is standalone, we could run the app with
> >>> the spark-class script. Is this the missing step in the documentation?
> >>>
> >>> The spark-class script does not launch a daemon; is it supposed to be
> >>> used with nohup for long-running applications?
> >>>
> >>> Finally, is the standalone cluster type used for real-world
> >>> applications, or do people use Spark on YARN or Mesos in production?
> >>
> >>
> >
>

Re: EC2 scripts documentations lacks how to actually run applications

Posted by Patrick Wendell <pw...@gmail.com>.
Hey Aureliano,

Yes, people run long-running applications with standalone mode and run
them in production. spark-class is a utility script for convenience.
If you want to run a long-running application you would write a Spark
application, bundle it, and submit it to the cluster. You can then
launch your own application with nohup or however else you want to
daemonize it.

Here is an example of a standalone application.

http://spark.incubator.apache.org/docs/latest/quick-start.html
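
As a rough sketch of that workflow (every command below is a hedged
stand-in so the pattern itself runs without sbt or a cluster installed;
the real steps would be an sbt build and a `java -cp ... <MainClass>
<master-url>` driver launch):

```shell
# Hypothetical sketch of "bundle it, then launch it detached":
# package the app, start the driver under nohup, check on it later.
sh -c 'echo building' > build.log        # stand-in for: sbt package
nohup sh -c 'sleep 1' > app.log 2>&1 &   # stand-in for the real driver command
echo $! > app.pid

# check whether the detached driver is still alive:
if kill -0 "$(cat app.pid)" 2>/dev/null; then
  echo "driver running"
fi
```

nohup plus backgrounding is enough to keep the driver alive after logout;
a proper init script or process supervisor would add restart-on-failure
on top of this.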

The pull request Mark referred to adds some support for submitting
your driver program to the cluster... but it's just an extra feature.
Launching packaged applications is the way you want to go for your use
case.

- Patrick

On Wed, Jan 8, 2014 at 10:31 AM, Mark Hamstra <ma...@clearstorydata.com> wrote:
> https://github.com/apache/incubator-spark/pull/293
>
>
> On Wed, Jan 8, 2014 at 10:12 AM, Aureliano Buendia <bu...@gmail.com>
> wrote:
>>
>> Here is a refactored version of the question:
>>
>> How do you run spark-class for long-running applications? Why doesn't
>> spark-class launch a daemon?
>>
>>
>> On Wed, Jan 8, 2014 at 3:21 AM, Aureliano Buendia <bu...@gmail.com>
>> wrote:
>>>
>>> Hi,
>>>
>>> The EC2 documentation has a section called 'Running Applications', but it
>>> actually lacks the step which should describe how to run the application.
>>>
>>> The spark_ec2 script seems to set up a standalone cluster, although I'm
>>> not sure why AMI_PREFIX points to a Mesos AMI list.
>>>
>>> Assuming that the cluster type is standalone, we could run the app with
>>> the spark-class script. Is this the missing step in the documentation?
>>>
>>> The spark-class script does not launch a daemon; is it supposed to be
>>> used with nohup for long-running applications?
>>>
>>> Finally, is the standalone cluster type used for real-world applications,
>>> or do people use Spark on YARN or Mesos in production?
>>
>>
>

Re: EC2 scripts documentations lacks how to actually run applications

Posted by Mark Hamstra <ma...@clearstorydata.com>.
https://github.com/apache/incubator-spark/pull/293


On Wed, Jan 8, 2014 at 10:12 AM, Aureliano Buendia <bu...@gmail.com> wrote:

> Here is a refactored version of the question:
>
> How do you run spark-class for long-running applications? Why doesn't
> spark-class launch a daemon?
>
>
> On Wed, Jan 8, 2014 at 3:21 AM, Aureliano Buendia <bu...@gmail.com> wrote:
>
>> Hi,
>>
>> The EC2 documentation <http://spark.incubator.apache.org/docs/0.8.1/ec2-scripts.html> has
>> a section called 'Running Applications', but it actually lacks the step
>> which should describe how to run the application.
>>
>> The spark_ec2 script <https://github.com/apache/incubator-spark/blob/59e8009b8d5e51b6f776720de8c9ecb09e1072dc/ec2/spark_ec2.py> seems
>> to set up a standalone cluster, although I'm not sure why AMI_PREFIX
>> points to a Mesos AMI list <https://github.com/apache/incubator-spark/blob/59e8009b8d5e51b6f776720de8c9ecb09e1072dc/ec2/spark_ec2.py#L44>.
>>
>> Assuming that the cluster type is standalone, we could run the app with
>> the spark-class script. Is this the missing step in the documentation?
>>
>> The spark-class script does not launch a daemon; is it supposed to be
>> used with nohup for long-running applications?
>>
>> Finally, is the standalone cluster type used for real-world applications,
>> or do people use Spark on YARN or Mesos in production?
>>
>
>

Re: EC2 scripts documentations lacks how to actually run applications

Posted by Aureliano Buendia <bu...@gmail.com>.
Here is a refactored version of the question:

How do you run spark-class for long-running applications? Why doesn't
spark-class launch a daemon?


On Wed, Jan 8, 2014 at 3:21 AM, Aureliano Buendia <bu...@gmail.com> wrote:

> Hi,
>
> The EC2 documentation <http://spark.incubator.apache.org/docs/0.8.1/ec2-scripts.html> has
> a section called 'Running Applications', but it actually lacks the step
> which should describe how to run the application.
>
> The spark_ec2 script <https://github.com/apache/incubator-spark/blob/59e8009b8d5e51b6f776720de8c9ecb09e1072dc/ec2/spark_ec2.py> seems
> to set up a standalone cluster, although I'm not sure why AMI_PREFIX
> points to a Mesos AMI list <https://github.com/apache/incubator-spark/blob/59e8009b8d5e51b6f776720de8c9ecb09e1072dc/ec2/spark_ec2.py#L44>.
>
> Assuming that the cluster type is standalone, we could run the app with
> the spark-class script. Is this the missing step in the documentation?
>
> The spark-class script does not launch a daemon; is it supposed to be used
> with nohup for long-running applications?
>
> Finally, is the standalone cluster type used for real-world applications,
> or do people use Spark on YARN or Mesos in production?
>