Posted to dev@spark.apache.org by karthik padmanabhan <tr...@gmail.com> on 2017/09/11 03:27:03 UTC

Supporting Apache Aurora as a cluster manager

Hi Spark Devs,

We are using Aurora (http://aurora.apache.org/) as our Mesos framework for
running stateless services. We would like to use Aurora to deploy big data
and batch workloads as well, and to that end we have forked Spark and
implemented the ExternalClusterManager trait.
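For readers unfamiliar with the hook in question: as of Spark 2.x, ExternalClusterManager is a private[spark] trait that SparkContext consults at startup, so an implementation looks roughly like the sketch below. The AuroraClusterManager / AuroraSchedulerBackend names and the aurora:// URL scheme are illustrative assumptions, not details from the actual fork:

```scala
// Sketch only -- the trait is private[spark], so the implementation must
// live under the org.apache.spark package and compile against Spark internals.
package org.apache.spark.scheduler.aurora

import org.apache.spark.SparkContext
import org.apache.spark.scheduler.{ExternalClusterManager, SchedulerBackend,
  TaskScheduler, TaskSchedulerImpl}

private[spark] class AuroraClusterManager extends ExternalClusterManager {

  // Claim master URLs of the (hypothetical) form aurora://...
  override def canCreate(masterURL: String): Boolean =
    masterURL.startsWith("aurora://")

  // Reuse Spark's standard TaskScheduler implementation.
  override def createTaskScheduler(sc: SparkContext, masterURL: String): TaskScheduler =
    new TaskSchedulerImpl(sc)

  // The backend that actually talks to Aurora; AuroraSchedulerBackend is
  // a hypothetical class standing in for the fork's real backend.
  override def createSchedulerBackend(
      sc: SparkContext,
      masterURL: String,
      scheduler: TaskScheduler): SchedulerBackend =
    new AuroraSchedulerBackend(scheduler.asInstanceOf[TaskSchedulerImpl], sc, masterURL)

  // Wire the scheduler to its backend once both have been created.
  override def initialize(scheduler: TaskScheduler, backend: SchedulerBackend): Unit =
    scheduler.asInstanceOf[TaskSchedulerImpl].initialize(backend)
}
```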

The reason for doing this, rather than running Spark on Mesos directly, is
to leverage the existing roles and quotas provided by Aurora for admission
control, and also to leverage Aurora features such as priority and
preemption. Additionally, we would like Aurora to be the only
deploy/orchestration system that our users interact with.

We have a working POC where Spark launches jobs through Aurora as the
ClusterManager. Is this something that can be merged upstream? If so, I can
create a design document and an associated JIRA ticket.

Thanks
Karthik

Re: Supporting Apache Aurora as a cluster manager

Posted by karthik padmanabhan <tr...@gmail.com>.
Hi Mark,

Thanks for getting back. I think you raise a very valid point about moving
to a plug-in-based architecture instead of supporting the idiosyncrasies of
different schedulers. Yeah, let me write a design doc so that it will at
least be another data point for how we think about the plug-in architecture
discussed in SPARK-19700.

Thanks
Karthik

On Sun, Sep 10, 2017 at 11:02 PM, Mark Hamstra <ma...@clearstorydata.com>
wrote:

> [...]

Re: Supporting Apache Aurora as a cluster manager

Posted by Mark Hamstra <ma...@clearstorydata.com>.
While it may be worth creating the design doc and JIRA ticket so that we at
least have a better idea and a record of what you are talking about, I kind
of doubt that we are going to want to merge this into the Spark codebase.
That's not because of anything specific to this Aurora effort, but rather
because scheduler implementations in general are not going in the preferred
direction. There is already some regret that the YARN scheduler wasn't
implemented by means of a scheduler plug-in API, and there is likely to be
more regret if we continue to go forward with the spark-on-kubernetes SPIP
in its present form. I'd guess that we are likely to merge code associated
with that SPIP just because Kubernetes has become such an important
resource scheduler, but such a merge wouldn't be without some misgivings.
That is because we just can't get into the position of having more and more
scheduler implementations in the Spark code, and more and more maintenance
overhead to keep up with the idiosyncrasies of all the scheduler
implementations. We've really got to get to the kind of plug-in
architecture discussed in SPARK-19700 so that scheduler implementations can
be done outside of the Spark codebase, release schedule, etc.
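For what it's worth, the existing ExternalClusterManager hook is already discovered via java.util.ServiceLoader, so an out-of-tree scheduler jar only needs to be on the driver classpath along with a service registration file; something like the following, where the class name is illustrative:

```
# src/main/resources/META-INF/services/org.apache.spark.scheduler.ExternalClusterManager
org.apache.spark.scheduler.aurora.AuroraClusterManager
```

The catch, and part of the motivation for the SPARK-19700 discussion, is that the trait is private[spark], so even an "external" implementation must sit in an org.apache.spark subpackage and compile against Spark internals rather than a stable public API.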

My opinion on the subject isn't dispositive on its own, of course, but that
is how I'm seeing things right now.

On Sun, Sep 10, 2017 at 8:27 PM, karthik padmanabhan <treadstone90@gmail.com
> wrote:

> [...]