Posted to user@mesos.apache.org by Dan Colish <dc...@urbanairship.com> on 2013/09/27 18:00:23 UTC

Aurora, Marathon and long lived job frameworks

I have been working on an internal project for executing a large number of
jobs across a cluster for the past couple of months and I am currently
doing a spike on using mesos for some of the cluster management tasks. The
clear prior art winners are Aurora and Marathon, but in both cases they
fall short of what I need.

In Aurora's case, the software is clearly very early in the open sourcing
process and as a result it is missing significant pieces. The biggest missing
piece is the actual execution framework, Thermos. [That is what I assume
Thermos does; I have no internal knowledge to verify that assumption.]
Additionally, Aurora is heavily optimized for a high user count and a large
number of incoming jobs. My use case is much simpler: there is only one
effective user and we have a small, known set of jobs which need to run.

On the other hand, Marathon is not designed for job execution if a job is
defined to be a smaller unit of work. Instead, Marathon describes itself as a
meta-framework for deploying frameworks to a Mesos cluster; a job to
Marathon is the framework that runs. I do not think Marathon would be a
good fit for managing my task execution and retry logic. It is designed
to run as a sub-layer of the cluster's resource allocation scheduler,
and its abstractions follow suit.

For my needs Aurora does appear to be a much closer fit than Marathon, but
neither is ideal. Since that is the case, I find myself left with a rough
choice. I am not thrilled with the prospect of yet another framework for
Mesos, but there is a lot of work which I have already completed for my
internal project that would need to be reworked to fit with Aurora. Currently
my project supports the following features.

* Distributed job locking - jobs cannot overlap
* Job execution delay queue - jobs can be run immediately or after a delay
* Job preemption
* Job success/failure tracking
* Garbage collection of dead jobs
* Job execution failover - job is retried on a new executor
* Executor warming - min # of executors idle
* Executor limits - max # of executors available

My plan for integration with Mesos is to adapt the job manager into a Mesos
scheduler and my execution slaves into a Mesos executor. At that point, my
framework will be able to run on the Mesos cluster, but I have a few
concerns about how to allocate and release the resources that the executors
will use over the lifetime of the cluster. I am not sure whether it is
better to be greedy early on in the framework's life-cycle or to decline
resources initially and scale the framework's slaves when jobs start coming
in. Additionally, the relationship between the executor and its associated
driver is not immediately clear to me. If I am reading the code correctly,
they do not provide a way to stop a task in progress short of killing the
executor process.
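
To make the scheduler side of that concrete, here is a rough sketch against
the callbacks in scheduler.hpp. The class name and callback bodies are
placeholders for my job manager's logic, not working code:

#include <string>
#include <vector>

#include <mesos/scheduler.hpp>

using namespace mesos;

// Placeholder: a thin adapter that drives my existing job manager from
// the Mesos scheduler callbacks.
class JobManagerScheduler : public Scheduler
{
public:
  virtual void registered(SchedulerDriver*, const FrameworkID&, const MasterInfo&) {}
  virtual void reregistered(SchedulerDriver*, const MasterInfo&) {}
  virtual void disconnected(SchedulerDriver*) {}

  virtual void resourceOffers(SchedulerDriver* driver, const std::vector<Offer>& offers)
  {
    // Match offers against the pending job queue; launch tasks or decline.
  }

  virtual void offerRescinded(SchedulerDriver*, const OfferID&) {}

  virtual void statusUpdate(SchedulerDriver* driver, const TaskStatus& status)
  {
    // Feed success/failure back into job tracking, retries and GC.
  }

  virtual void frameworkMessage(SchedulerDriver*, const ExecutorID&, const SlaveID&, const std::string&) {}
  virtual void slaveLost(SchedulerDriver*, const SlaveID&) {}
  virtual void executorLost(SchedulerDriver*, const ExecutorID&, const SlaveID&, int) {}
  virtual void error(SchedulerDriver*, const std::string& message) {}
};

int main()
{
  FrameworkInfo framework;
  framework.set_user("");            // Empty user: Mesos fills in the current user.
  framework.set_name("job-manager");

  JobManagerScheduler scheduler;
  MesosSchedulerDriver driver(&scheduler, framework, "localhost:5050");  // Example master address.
  return driver.run() == DRIVER_STOPPED ? 0 : 1;
}

The executor side would be the analogous wrapper around the callbacks in
executor.hpp.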

I think that mesos will be a nice feature to add to my project and I would
really appreciate any feedback from the community. I will provide progress
updates as I continue work on my experiments.

Re: Aurora, Marathon and long lived job frameworks

Posted by Bill Farner <bi...@twitter.com>.
Ben pretty accurately described how Aurora fills some of these duties, but
Dan is right — we're still on the cusp of being *really* open sourced, so
it's not very usable yet.  Once our incubator vote is over, I hope to
promptly change this so outside users and contributors can dive in.

-=Bill


On Fri, Sep 27, 2013 at 5:21 PM, Benjamin Mahler
<be...@gmail.com>wrote:

> I've replied inline below, also cc'ed some of the Aurora / Thermos
> developers to better answer your questions.
>
> On Fri, Sep 27, 2013 at 9:00 AM, Dan Colish <dc...@urbanairship.com>wrote:
>
>> I have been working on an internal project for executing a large number
>> of jobs across a cluster for the past couple of months and I am currently
>> doing a spike on using mesos for some of the cluster management tasks. The
>> clear prior art winners are Aurora and Marathon, but in both cases they
>> fall short of what I need.
>>
>> In aurora's case, the software is clearly very early in the open sourcing
>> process and as a result it missing significant pieces. The biggest missing
>> piece is the actual execution framework, Thermos. [That is what I assume
>> thermos does. I have no internal knowledge to verify that assumption]
>> Additionally, Aurora is heavily optimized for a high user count and large
>> number of incoming jobs. My use case is much simpler. There is only one
>> effective user and we have a small known set of jobs which need to run.
>>
>> On the other hand, Marathon is not designed for job execution if job is
>> defined to be a smaller unit of work. Instead, Marathon self-describes as a
>> meta-framework for deploying frameworks to a mesos cluster. A job to
>> marathon is the framework that runs. I do not think Marathon would be a
>> good fit for managing the my task execution and retry logic. It is designed
>> to run at on as a sub-layer of the cluster's resource allocation scheduler
>> and its abstractions follow suit.
>>
>> For my needs Aurora does appear to be a much closer fit than Marathon,
>> but neither is ideal. Since that is the case, I find myself left with a
>> rough choice. I am not thrilled with the prospect of yet another framework
>> for Mesos, but there is a lot of work which I have already completed for my
>> internal project that would need to reworked to fit with Aurora. Currently
>> my project can support the following features.
>>
>> * Distributed job locking - jobs cannot overlap
>>
>
> Can you elaborate on the ways in which jobs cannot overlap? Aurora may
> provide what you need here.
>
>
>> * Job execution delay queue - jobs can be run immediately or after a delay
>> * Job preemption
>>
>
> Aurora has the concept of pre-emption when there are insufficient
> resources for "production" jobs to run. I will defer to Aurora devs on
> elaborating here.
>
>
>> * Job success/failure tracking
>>
>
> What kind of tracking?
>
>
>> * Garbage collection of dead jobs
>>
>
> This is present in Aurora. Eventually, a completed job will be purged from
> the entire system. What kind of garbage collection are you referring to?
>
>
>> * Job execution failover - job is retried on a new executor
>>
>
> In Aurora, jobs are restarted if they fail.
>
>
>> * Executor warming - min # of executors idle
>> * Executor limits - max # of executors available
>>
>> My plan for integration with mesos is to adapt the job manager into a
>> mesos scheduler and my execution slaves into a mesos executor. At that
>> point, my framework will be able to run on the mesos cluster, but I have a
>> few concerns about how to allocated and release resources that the
>> executors will use over the lifetime of the cluster. I am not sure whether
>> it is better to be greedy early on in the frameworks life-cycle or to
>> decline resources initially and scale the framework's slaves when jobs
>> start coming in.
>>
>
> The better design would be to use resources as you need them (rather than
> greedily holding onto offers). Is there any motivation for the greedy
> approach?
>
>
>> Additionally, the relationship between the executor and its associated
>> driver are not immediately clear to me. If I am reading the code correctly,
>> they do not provide a way to stop a task in progress short of killing the
>> executor process.
>>
>
> You can kill tasks in progress, the executor process receives the request
> to kill the task. See SchedulerDriver::killTask and Executor::killTask
> (scheduler.hpp and executor.hpp).
>
>
>> I think that mesos will be a nice feature to add to my project and I
>> would really appreciate any feedback from the community. I will provide
>> progress updates as I continue work on my experiments.
>>
>
>

Re: Aurora, Marathon and long lived job frameworks

Posted by Bill Farner <bi...@twitter.com>.
On Sat, Sep 28, 2013 at 1:16 PM, Dan Colish <dc...@urbanairship.com>wrote:

> On Fri, Sep 27, 2013 at 5:21 PM, Benjamin Mahler <
> benjamin.mahler@gmail.com> wrote:
>
>> I've replied inline below, also cc'ed some of the Aurora / Thermos
>> developers to better answer your questions.
>>
>
> Thank you very much! I'll respond inline as well.
>
>
>>
>> On Fri, Sep 27, 2013 at 9:00 AM, Dan Colish <dc...@urbanairship.com>wrote:
>>
>>> I have been working on an internal project for executing a large number
>>> of jobs across a cluster for the past couple of months and I am currently
>>> doing a spike on using mesos for some of the cluster management tasks. The
>>> clear prior art winners are Aurora and Marathon, but in both cases they
>>> fall short of what I need.
>>>
>>> In aurora's case, the software is clearly very early in the open
>>> sourcing process and as a result it missing significant pieces. The biggest
>>> missing piece is the actual execution framework, Thermos. [That is what I
>>> assume thermos does. I have no internal knowledge to verify that
>>> assumption] Additionally, Aurora is heavily optimized for a high user count
>>> and large number of incoming jobs. My use case is much simpler. There is
>>> only one effective user and we have a small known set of jobs which need to
>>> run.
>>>
>>> On the other hand, Marathon is not designed for job execution if job is
>>> defined to be a smaller unit of work. Instead, Marathon self-describes as a
>>> meta-framework for deploying frameworks to a mesos cluster. A job to
>>> marathon is the framework that runs. I do not think Marathon would be a
>>> good fit for managing the my task execution and retry logic. It is designed
>>> to run at on as a sub-layer of the cluster's resource allocation scheduler
>>> and its abstractions follow suit.
>>>
>>> For my needs Aurora does appear to be a much closer fit than Marathon,
>>> but neither is ideal. Since that is the case, I find myself left with a
>>> rough choice. I am not thrilled with the prospect of yet another framework
>>> for Mesos, but there is a lot of work which I have already completed for my
>>> internal project that would need to reworked to fit with Aurora. Currently
>>> my project can support the following features.
>>>
>>> * Distributed job locking - jobs cannot overlap
>>>
>>
>> Can you elaborate on the ways in which jobs cannot overlap? Aurora may
>> provide what you need here.
>>
>
> My use case requires the guarantee that only one instance of a job can be
> run at a given time. In my current implementation, I am using a distributed
> pessimistic offline lock to provide this guarantee. My lock implementation
> also guarantees modifications are atomic. Looking at Aurora, it follows a
> similar approach. Unfortunately, the MemStorarge is not appropriate for
> production usage and that is the only Storage implementation in Aurora
> which can provide the consistency I need for the locking strategy to work.
> It would be possible to include my lock in a persistent Storage
> implementation within Aurora, but I suspect that one already exists and
> could be made available eventually. If that is not the case, it might be
> worth talking more about how my implementation works and if there is any
> opportunity for collaboration.
>

Aurora currently errs on the side of availability (keeping at least N
instances up, converging towards N).  If we are signaled that a task is
LOST, we will aggressively replace it rather than (possibly hopelessly)
waiting for an indication that the task is confirmed dead.  So far we've
suggested that services with stricter mutual exclusion requirements be
defensive and build it into the app (e.g. a lock in a database or
ZooKeeper).
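
As a rough illustration of building that into the app (this is not Aurora
code; error handling, retries and releasing the lock on completion are
omitted), a job could take an exclusive lock by creating an ephemeral znode
with the ZooKeeper C client before it starts work:

#include <cstdio>
#include <cstring>

#include <zookeeper/zookeeper.h>

// Try to take an exclusive lock for `job` by creating an ephemeral znode.
// If our session dies, ZooKeeper deletes the node and the lock is released.
bool tryLockJob(zhandle_t* zh, const char* job, const char* owner)
{
  char path[256];
  snprintf(path, sizeof(path), "/locks/%s", job);

  int rc = zoo_create(zh, path, owner, static_cast<int>(strlen(owner)),
                      &ZOO_OPEN_ACL_UNSAFE, ZOO_EPHEMERAL,
                      NULL, 0);

  if (rc == ZOK) {
    return true;             // We hold the lock; safe to start the job.
  }
  if (rc == ZNODEEXISTS) {
    return false;            // Another instance already holds it.
  }
  return false;              // Connection loss etc.; treat as not acquired.
}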

Don't be misled by MemStorage; it's just the in-memory portion of the
storage system.  All state required for a failover is persisted (see
LogStorage).


>
>>
>>> * Job execution delay queue - jobs can be run immediately or after a
>>> delay
>>> * Job preemption
>>>
>>
>> Aurora has the concept of pre-emption when there are insufficient
>> resources for "production" jobs to run. I will defer to Aurora devs on
>> elaborating here.
>>
>
> Excellent, my job preemption is actually focused on freeing up jobs which
> are running too slow or otherwise failing to make progress. It also
> provides part of the general job canceling mechanism. I've looked over the
> preemption code in Aurora and it also looks very similar.
>
>
>>
>>
>>> * Job success/failure tracking
>>>
>>
>> What kind of tracking?
>>
>
> I provide an interface for taking actions upon a job's completion. This
> callback can by a job designer for any number of uses. Tracking which jobs
> succeed and fail is the only use case I have for it now. There were some
> future ideas to use it for kicking of dependent jobs or taking other
> administrative actions.
>
>
>>
>>
>>> * Garbage collection of dead jobs
>>>
>>
>> This is present in Aurora. Eventually, a completed job will be purged
>> from the entire system. What kind of garbage collection are you referring
>> to?
>>
>
> That is exactly the kind of garbage collection I was speaking of. My
> system will remove jobs when they complete and clean up resources such as
> the locks taken. It will also preempt and remove jobs which are no longer
> making progress.
>
>
>>
>>
>>> * Job execution failover - job is retried on a new executor
>>>
>>
>> In Aurora, jobs are restarted if they fail.
>>
>
> Unfortunately, my implementation of the feature is quite crude. It will
> just retry N times and give up if the job continues to fail. Does Aurora
> make any smarter determination about retrying? The job state machine in
> aurora provides a few more states than mine so it might have enough
> information to provide a richer retry framework. I would need to read a bit
> more to figure that out.
>

Aurora has a few 'modes' to handle this.  Anything with service=true is
restarted forever.  Otherwise, maxTaskFailures indicates how many times a
task may fail before we give up on it.


>>
>>>  * Executor warming - min # of executors idle
>>> * Executor limits - max # of executors available
>>>
>>> My plan for integration with mesos is to adapt the job manager into a
>>> mesos scheduler and my execution slaves into a mesos executor. At that
>>> point, my framework will be able to run on the mesos cluster, but I have a
>>> few concerns about how to allocated and release resources that the
>>> executors will use over the lifetime of the cluster. I am not sure whether
>>> it is better to be greedy early on in the frameworks life-cycle or to
>>> decline resources initially and scale the framework's slaves when jobs
>>> start coming in.
>>>
>>
>> The better design would be to use resources as you need them (rather than
>> greedily holding onto offers). Is there any motivation for the greedy
>> approach?
>>
>
> My consideration for the greedy approach was based on what I thought
> aurora was doing internally. It looks like the OfferQueue adds resources
> and then matches those resources to job tasks from the TaskStore as they
> come in. I took that to mean that Aurora would hold on to resources for as
> long as possible. I think I am misunderstanding how often offers will come
> into a scheduler. A sequence diagram for Aurora's resource to job matching
> would be extremely helpful. Its a little tricky to trace control flow
> within Aurora since so many actions are asynchronous. My current belief of
> what happens is this
>
> [Scheduler receives offer] -> [Match a pending task to offer immediately]
> -> [Matched] -> [begin exec]
>
>                                -> [No tasks or not large enough] -> [Add to
> OfferQueue]
>
> [TaskGroup:Monitor] -> [TaskScheduler] -> [Load task from store] ->
> [Attempt to match with OfferQueue] -> [Matched] -> [Exec]
>
>
>   -> [Unmatched] -> [Return control to Monitor] -> [Monitor backs off or
> cancels group]
>
>
> Given these scenarios, I have yet to find a place where Offers will be
> cancelled aside from the Scheudler::offerRescinded callback in the
> framework scheduler. This is what lead me to believe that Aurora is greedy
> about Offers and only gives them up when they are either used to launch a
> task or rescinded. I do agree that a fairer system is to only accept offers
> that you need and cancel the rest. Exactly how that would be implemented I
> will need to think more on.
>

You're correct that Aurora is greedy with offers.  It's my understanding
that mesos will soon be more aggressive about rescinding offers, which *
should* make this behavior more multi-framework-friendly in the future.  We
originally responded to offers synchronously when handling resourceOffers,
but changed to asynchronously accepting offers so we could have
deterministic responsiveness to the callback (since mesos callbacks are
effectively singly-threaded, and holding up resourceOffers stalls important
things like statusUpdate).

Your comprehension of the offer matching in Aurora is correct, though there
is one nuance.  We have two types of tasks — user tasks and system tasks
(e.g. garbage collection to reconcile potential scheduler/slave state
differences).  System tasks are always accepted synchronously, and user
tasks are always accepted asynchronously, via OfferQueue.

We do return offers on two conditions: when we're holding two offers for
the same slave (returned for coalescing), and after a timeout (defaulting
to 5 minutes).
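
In toy form, that hold-and-return behavior looks roughly like this (an
illustration only, not the actual OfferQueue implementation):

#include <ctime>
#include <map>
#include <string>

#include <mesos/scheduler.hpp>

using namespace mesos;

// Toy illustration of holding offers and returning them later.
class HeldOffers
{
  struct Held { Offer offer; time_t since; };
  std::map<std::string, Held> held;  // Keyed by slave id.

public:
  // Hold a new offer; if one is already held for the same slave, return
  // the old one so the master can coalesce the resources.
  void add(SchedulerDriver* driver, const Offer& offer)
  {
    const std::string slave = offer.slave_id().value();
    if (held.count(slave) > 0) {
      driver->declineOffer(held[slave].offer.id());
    }
    Held h;
    h.offer = offer;
    h.since = time(NULL);
    held[slave] = h;
  }

  // Called periodically: return any offer held longer than `timeout` seconds.
  void expire(SchedulerDriver* driver, int timeout)
  {
    const time_t now = time(NULL);
    std::map<std::string, Held>::iterator it = held.begin();
    while (it != held.end()) {
      if (now - it->second.since > timeout) {
        driver->declineOffer(it->second.offer.id());
        held.erase(it++);
      } else {
        ++it;
      }
    }
  }
};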


>
>
>>
>>> Additionally, the relationship between the executor and its associated
>>> driver are not immediately clear to me. If I am reading the code correctly,
>>> they do not provide a way to stop a task in progress short of killing the
>>> executor process.
>>>
>>
>> You can kill tasks in progress, the executor process receives the request
>> to kill the task. See SchedulerDriver::killTask and Executor::killTask
>> (scheduler.hpp and executor.hpp).
>>
>>
>
> Ah yes, I missed that. It was very clear once I traced the message flow
> from Scheduler -> Master -> Slave -> Executor. I was probably confusing the
> comment in Executor::launchTask about no other callbacks getting called
> before that one completes with the task actually blocking the callbacks. I
> suppose if you wrote an executor which did not run the task asynchronously
> and launchTask was blocked from returning then you would be in some trouble
> with regard to task control.
>
>
>> I think that mesos will be a nice feature to add to my project and I
>>> would really appreciate any feedback from the community. I will provide
>>> progress updates as I continue work on my experiments.
>>>
>>
>>
>

Re: Aurora, Marathon and long lived job frameworks

Posted by Dan Colish <dc...@urbanairship.com>.
On Fri, Sep 27, 2013 at 5:21 PM, Benjamin Mahler
<be...@gmail.com>wrote:

> I've replied inline below, also cc'ed some of the Aurora / Thermos
> developers to better answer your questions.
>

Thank you very much! I'll respond inline as well.


>
> On Fri, Sep 27, 2013 at 9:00 AM, Dan Colish <dc...@urbanairship.com>wrote:
>
>> I have been working on an internal project for executing a large number
>> of jobs across a cluster for the past couple of months and I am currently
>> doing a spike on using mesos for some of the cluster management tasks. The
>> clear prior art winners are Aurora and Marathon, but in both cases they
>> fall short of what I need.
>>
>> In aurora's case, the software is clearly very early in the open sourcing
>> process and as a result it missing significant pieces. The biggest missing
>> piece is the actual execution framework, Thermos. [That is what I assume
>> thermos does. I have no internal knowledge to verify that assumption]
>> Additionally, Aurora is heavily optimized for a high user count and large
>> number of incoming jobs. My use case is much simpler. There is only one
>> effective user and we have a small known set of jobs which need to run.
>>
>> On the other hand, Marathon is not designed for job execution if job is
>> defined to be a smaller unit of work. Instead, Marathon self-describes as a
>> meta-framework for deploying frameworks to a mesos cluster. A job to
>> marathon is the framework that runs. I do not think Marathon would be a
>> good fit for managing the my task execution and retry logic. It is designed
>> to run at on as a sub-layer of the cluster's resource allocation scheduler
>> and its abstractions follow suit.
>>
>> For my needs Aurora does appear to be a much closer fit than Marathon,
>> but neither is ideal. Since that is the case, I find myself left with a
>> rough choice. I am not thrilled with the prospect of yet another framework
>> for Mesos, but there is a lot of work which I have already completed for my
>> internal project that would need to reworked to fit with Aurora. Currently
>> my project can support the following features.
>>
>> * Distributed job locking - jobs cannot overlap
>>
>
> Can you elaborate on the ways in which jobs cannot overlap? Aurora may
> provide what you need here.
>

My use case requires the guarantee that only one instance of a job can be
run at a given time. In my current implementation, I am using a distributed
pessimistic offline lock to provide this guarantee. My lock implementation
also guarantees modifications are atomic. Looking at Aurora, it follows a
similar approach. Unfortunately, MemStorage is not appropriate for
production usage and that is the only Storage implementation in Aurora
which can provide the consistency I need for the locking strategy to work.
It would be possible to include my lock in a persistent Storage
implementation within Aurora, but I suspect that one already exists and
could be made available eventually. If that is not the case, it might be
worth talking more about how my implementation works and whether there is
any opportunity for collaboration.


>
>
>> * Job execution delay queue - jobs can be run immediately or after a delay
>> * Job preemption
>>
>
> Aurora has the concept of pre-emption when there are insufficient
> resources for "production" jobs to run. I will defer to Aurora devs on
> elaborating here.
>

Excellent, my job preemption is actually focused on freeing up jobs which
are running too slowly or otherwise failing to make progress. It also
provides part of the general job-canceling mechanism. I've looked over the
preemption code in Aurora and it looks very similar.


>
>
>> * Job success/failure tracking
>>
>
> What kind of tracking?
>

I provide an interface for taking actions upon a job's completion. This
callback can be used by a job designer for any number of purposes. Tracking
which jobs succeed and fail is the only use case I have for it now. There
were some future ideas to use it for kicking off dependent jobs or taking
other administrative actions.
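
In rough outline the hook looks like this (sketched in C++ for illustration;
the real interface carries more context than this):

#include <string>

// Simplified sketch of the completion hook (names are illustrative).
struct JobResult
{
  std::string jobId;
  bool succeeded;
  std::string message;
};

// Implemented by the job designer; invoked once when a job finishes.
class CompletionListener
{
public:
  virtual ~CompletionListener() {}
  virtual void onComplete(const JobResult& result) = 0;
};

// The one use we have today: count successes and failures for reporting.
class SuccessFailureTracker : public CompletionListener
{
public:
  SuccessFailureTracker() : succeeded(0), failed(0) {}

  virtual void onComplete(const JobResult& result)
  {
    if (result.succeeded) {
      ++succeeded;
    } else {
      ++failed;
    }
  }

private:
  int succeeded;
  int failed;
};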


>
>
>> * Garbage collection of dead jobs
>>
>
> This is present in Aurora. Eventually, a completed job will be purged from
> the entire system. What kind of garbage collection are you referring to?
>

That is exactly the kind of garbage collection I was speaking of. My system
will remove jobs when they complete and clean up resources such as the
locks taken. It will also preempt and remove jobs which are no longer
making progress.


>
>
>> * Job execution failover - job is retried on a new executor
>>
>
> In Aurora, jobs are restarted if they fail.
>

Unfortunately, my implementation of the feature is quite crude. It will
just retry N times and give up if the job continues to fail. Does Aurora
make any smarter determination about retrying? The job state machine in
Aurora provides a few more states than mine, so it might have enough
information to provide a richer retry framework. I would need to read a bit
more to figure that out.
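
Roughly, the logic amounts to this (a simplified sketch in C++ for
illustration):

#include <map>
#include <string>

// Simplified sketch of the current retry logic: give up after N failures.
class RetryTracker
{
public:
  explicit RetryTracker(int maxRetries) : maxRetries(maxRetries) {}

  // Called when a job execution fails; true means retry on a new executor,
  // false means give up and mark the job as failed.
  bool shouldRetry(const std::string& jobId)
  {
    return ++failures[jobId] <= maxRetries;
  }

private:
  const int maxRetries;
  std::map<std::string, int> failures;
};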


>
>
>> * Executor warming - min # of executors idle
>> * Executor limits - max # of executors available
>>
>> My plan for integration with mesos is to adapt the job manager into a
>> mesos scheduler and my execution slaves into a mesos executor. At that
>> point, my framework will be able to run on the mesos cluster, but I have a
>> few concerns about how to allocated and release resources that the
>> executors will use over the lifetime of the cluster. I am not sure whether
>> it is better to be greedy early on in the frameworks life-cycle or to
>> decline resources initially and scale the framework's slaves when jobs
>> start coming in.
>>
>
> The better design would be to use resources as you need them (rather than
> greedily holding onto offers). Is there any motivation for the greedy
> approach?
>

My consideration for the greedy approach was based on what I thought Aurora
was doing internally. It looks like the OfferQueue adds resources and then
matches those resources to job tasks from the TaskStore as they come in. I
took that to mean that Aurora would hold on to resources for as long as
possible. I think I am misunderstanding how often offers will come into a
scheduler. A sequence diagram for Aurora's resource-to-job matching would
be extremely helpful. It's a little tricky to trace control flow within
Aurora since so many actions are asynchronous. My current belief of what
happens is this:

[Scheduler receives offer] -> [Match a pending task to offer immediately]
    -> [Matched] -> [begin exec]
    -> [No tasks or not large enough] -> [Add to OfferQueue]

[TaskGroup:Monitor] -> [TaskScheduler] -> [Load task from store]
    -> [Attempt to match with OfferQueue]
        -> [Matched] -> [Exec]
        -> [Unmatched] -> [Return control to Monitor]
                       -> [Monitor backs off or cancels group]


Given these scenarios, I have yet to find a place where Offers will be
cancelled aside from the Scheduler::offerRescinded callback in the
framework scheduler. This is what led me to believe that Aurora is greedy
about Offers and only gives them up when they are either used to launch a
task or rescinded. I do agree that a fairer system is to only accept offers
that you need and cancel the rest. Exactly how that would be implemented I
will need to think more on.


>
>> Additionally, the relationship between the executor and its associated
>> driver are not immediately clear to me. If I am reading the code correctly,
>> they do not provide a way to stop a task in progress short of killing the
>> executor process.
>>
>
> You can kill tasks in progress, the executor process receives the request
> to kill the task. See SchedulerDriver::killTask and Executor::killTask
> (scheduler.hpp and executor.hpp).
>
>

Ah yes, I missed that. It was very clear once I traced the message flow
from Scheduler -> Master -> Slave -> Executor. I was probably conflating the
comment on Executor::launchTask (that no other callbacks get invoked before
it completes) with the task itself blocking those callbacks. I suppose if
you wrote an executor which did not run the task asynchronously, so that
launchTask was blocked from returning, then you would be in some trouble
with regard to task control.
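
So the executor needs to hand the task off and return immediately, something
like this (a sketch only; the remaining required Executor callbacks and the
actual work are omitted):

#include <thread>

#include <mesos/executor.hpp>

using namespace mesos;

// Placeholder for the framework's actual job execution.
void runJob(const TaskInfo& task);

// Sketch of an Executor::launchTask implementation that hands the work to
// a separate thread, so the callback returns immediately and later
// callbacks (killTask in particular) are never blocked behind it.
void handleLaunchTask(ExecutorDriver* driver, const TaskInfo& task)
{
  std::thread([driver, task]() {
    TaskStatus status;
    status.mutable_task_id()->CopyFrom(task.task_id());
    status.set_state(TASK_RUNNING);
    driver->sendStatusUpdate(status);

    runJob(task);

    status.set_state(TASK_FINISHED);
    driver->sendStatusUpdate(status);
  }).detach();
}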


> I think that mesos will be a nice feature to add to my project and I would
>> really appreciate any feedback from the community. I will provide progress
>> updates as I continue work on my experiments.
>>
>
>

Re: Aurora, Marathon and long lived job frameworks

Posted by Benjamin Mahler <be...@gmail.com>.
I've replied inline below, also cc'ed some of the Aurora / Thermos
developers to better answer your questions.

On Fri, Sep 27, 2013 at 9:00 AM, Dan Colish <dc...@urbanairship.com>wrote:

> I have been working on an internal project for executing a large number of
> jobs across a cluster for the past couple of months and I am currently
> doing a spike on using mesos for some of the cluster management tasks. The
> clear prior art winners are Aurora and Marathon, but in both cases they
> fall short of what I need.
>
> In aurora's case, the software is clearly very early in the open sourcing
> process and as a result it missing significant pieces. The biggest missing
> piece is the actual execution framework, Thermos. [That is what I assume
> thermos does. I have no internal knowledge to verify that assumption]
> Additionally, Aurora is heavily optimized for a high user count and large
> number of incoming jobs. My use case is much simpler. There is only one
> effective user and we have a small known set of jobs which need to run.
>
> On the other hand, Marathon is not designed for job execution if job is
> defined to be a smaller unit of work. Instead, Marathon self-describes as a
> meta-framework for deploying frameworks to a mesos cluster. A job to
> marathon is the framework that runs. I do not think Marathon would be a
> good fit for managing the my task execution and retry logic. It is designed
> to run at on as a sub-layer of the cluster's resource allocation scheduler
> and its abstractions follow suit.
>
> For my needs Aurora does appear to be a much closer fit than Marathon, but
> neither is ideal. Since that is the case, I find myself left with a rough
> choice. I am not thrilled with the prospect of yet another framework for
> Mesos, but there is a lot of work which I have already completed for my
> internal project that would need to reworked to fit with Aurora. Currently
> my project can support the following features.
>
> * Distributed job locking - jobs cannot overlap
>

Can you elaborate on the ways in which jobs cannot overlap? Aurora may
provide what you need here.


> * Job execution delay queue - jobs can be run immediately or after a delay
> * Job preemption
>

Aurora has the concept of pre-emption when there are insufficient resources
for "production" jobs to run. I will defer to Aurora devs on elaborating
here.


> * Job success/failure tracking
>

What kind of tracking?


> * Garbage collection of dead jobs
>

This is present in Aurora. Eventually, a completed job will be purged from
the entire system. What kind of garbage collection are you referring to?


> * Job execution failover - job is retried on a new executor
>

In Aurora, jobs are restarted if they fail.


> * Executor warming - min # of executors idle
> * Executor limits - max # of executors available
>
> My plan for integration with mesos is to adapt the job manager into a
> mesos scheduler and my execution slaves into a mesos executor. At that
> point, my framework will be able to run on the mesos cluster, but I have a
> few concerns about how to allocated and release resources that the
> executors will use over the lifetime of the cluster. I am not sure whether
> it is better to be greedy early on in the frameworks life-cycle or to
> decline resources initially and scale the framework's slaves when jobs
> start coming in.
>

The better design would be to use resources as you need them (rather than
greedily holding onto offers). Is there any motivation for the greedy
approach?
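
For example, in resourceOffers you can launch on the offers you need and
decline the rest right away (a sketch; the helper functions are placeholders
for your framework's own logic):

#include <vector>

#include <mesos/scheduler.hpp>

using namespace mesos;

// Hypothetical helpers, standing in for the framework's own logic.
bool havePendingJobThatFits(const Offer& offer);
TaskInfo makeTaskFor(const Offer& offer);

// Sketch of a Scheduler::resourceOffers implementation that only keeps
// what it can use right now and declines everything else.
void handleOffers(SchedulerDriver* driver, const std::vector<Offer>& offers)
{
  for (size_t i = 0; i < offers.size(); i++) {
    const Offer& offer = offers[i];

    if (havePendingJobThatFits(offer)) {
      std::vector<TaskInfo> tasks;
      tasks.push_back(makeTaskFor(offer));
      driver->launchTasks(offer.id(), tasks);
    } else {
      // Nothing to run right now: give the resources back immediately
      // so other frameworks can use them.
      driver->declineOffer(offer.id());
    }
  }
}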


> Additionally, the relationship between the executor and its associated
> driver are not immediately clear to me. If I am reading the code correctly,
> they do not provide a way to stop a task in progress short of killing the
> executor process.
>

You can kill tasks in progress; the executor process receives the request
to kill the task. See SchedulerDriver::killTask and Executor::killTask
(scheduler.hpp and executor.hpp).
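
Roughly, the flow is that the scheduler requests the kill and the executor
acknowledges it with a terminal status update (an illustrative sketch;
stopRunningJob is a placeholder for however your executor stops the work):

#include <mesos/executor.hpp>

using namespace mesos;

// Hypothetical hook into the framework's own task bookkeeping.
void stopRunningJob(const TaskID& taskId);

// Scheduler side: driver->killTask(taskId) asks Mesos to kill the task.
// Executor side: the request arrives as the killTask callback; a sketch of
// what that callback might do:
void handleKillTask(ExecutorDriver* driver, const TaskID& taskId)
{
  stopRunningJob(taskId);                       // Stop the work in progress.

  TaskStatus status;                            // Report a terminal state
  status.mutable_task_id()->CopyFrom(taskId);   // back to the scheduler.
  status.set_state(TASK_KILLED);
  driver->sendStatusUpdate(status);
}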


> I think that mesos will be a nice feature to add to my project and I would
> really appreciate any feedback from the community. I will provide progress
> updates as I continue work on my experiments.
>

Re: Aurora, Marathon and long lived job frameworks

Posted by Sam Taha <ta...@gmail.com>.
While still in active development, I expect JobServer to match some of the
criteria you describe once Mesos integration is complete. It currently
supports these features for static node clusters. With Mesos integration,
it will have dynamic clustering capability while still retaining the
enterprise-type job scheduling, monitoring, and tracking features.

Thanks,
Sam Taha

http://www.grandlogic.com


On Fri, Sep 27, 2013 at 12:59 PM, Dan Colish <dc...@urbanairship.com>wrote:

>
> On Fri, Sep 27, 2013 at 9:04 AM, Damien Hardy <dh...@viadeoteam.com>wrote:
>
>> Hello,
>>
>> What about chronos http://airbnb.github.io/chronos/
>>
>>
> Yes, I evaluated chronos and it was not clear to me how it matches my
> selection criteria. It might be my unfamiliarity with the framework rather
> than a lack of features. Is there anyone who could elaborate more?
>
>
>> Best regards,
>>
>>
>> 2013/9/27 Dan Colish <dc...@urbanairship.com>
>>
>>> I have been working on an internal project for executing a large number
>>> of jobs across a cluster for the past couple of months and I am currently
>>> doing a spike on using mesos for some of the cluster management tasks. The
>>> clear prior art winners are Aurora and Marathon, but in both cases they
>>> fall short of what I need.
>>>
>>> In aurora's case, the software is clearly very early in the open
>>> sourcing process and as a result it missing significant pieces. The biggest
>>> missing piece is the actual execution framework, Thermos. [That is what I
>>> assume thermos does. I have no internal knowledge to verify that
>>> assumption] Additionally, Aurora is heavily optimized for a high user count
>>> and large number of incoming jobs. My use case is much simpler. There is
>>> only one effective user and we have a small known set of jobs which need to
>>> run.
>>>
>>> On the other hand, Marathon is not designed for job execution if job is
>>> defined to be a smaller unit of work. Instead, Marathon self-describes as a
>>> meta-framework for deploying frameworks to a mesos cluster. A job to
>>> marathon is the framework that runs. I do not think Marathon would be a
>>> good fit for managing the my task execution and retry logic. It is designed
>>> to run at on as a sub-layer of the cluster's resource allocation scheduler
>>> and its abstractions follow suit.
>>>
>>> For my needs Aurora does appear to be a much closer fit than Marathon,
>>> but neither is ideal. Since that is the case, I find myself left with a
>>> rough choice. I am not thrilled with the prospect of yet another framework
>>> for Mesos, but there is a lot of work which I have already completed for my
>>> internal project that would need to reworked to fit with Aurora. Currently
>>> my project can support the following features.
>>>
>>> * Distributed job locking - jobs cannot overlap
>>> * Job execution delay queue - jobs can be run immediately or after a
>>> delay
>>> * Job preemption
>>> * Job success/failure tracking
>>> * Garbage collection of dead jobs
>>> * Job execution failover - job is retried on a new executor
>>> * Executor warming - min # of executors idle
>>> * Executor limits - max # of executors available
>>>
>>> My plan for integration with mesos is to adapt the job manager into a
>>> mesos scheduler and my execution slaves into a mesos executor. At that
>>> point, my framework will be able to run on the mesos cluster, but I have a
>>> few concerns about how to allocated and release resources that the
>>> executors will use over the lifetime of the cluster. I am not sure whether
>>> it is better to be greedy early on in the frameworks life-cycle or to
>>> decline resources initially and scale the framework's slaves when jobs
>>> start coming in. Additionally, the relationship between the executor and
>>> its associated driver are not immediately clear to me. If I am reading the
>>> code correctly, they do not provide a way to stop a task in progress short
>>> of killing the executor process.
>>>
>>> I think that mesos will be a nice feature to add to my project and I
>>> would really appreciate any feedback from the community. I will provide
>>> progress updates as I continue work on my experiments.
>>>
>>
>>
>>
>> --
>> Damien HARDY
>>
>
>

Re: Aurora, Marathon and long lived job frameworks

Posted by Dan Colish <dc...@urbanairship.com>.
On Fri, Sep 27, 2013 at 9:04 AM, Damien Hardy <dh...@viadeoteam.com> wrote:

> Hello,
>
> What about chronos http://airbnb.github.io/chronos/
>
>
Yes, I evaluated Chronos and it was not clear to me how it matches my
selection criteria. It might be my unfamiliarity with the framework rather
than a lack of features. Is there anyone who could elaborate more?


> Best regards,
>
>
> 2013/9/27 Dan Colish <dc...@urbanairship.com>
>
>> I have been working on an internal project for executing a large number
>> of jobs across a cluster for the past couple of months and I am currently
>> doing a spike on using mesos for some of the cluster management tasks. The
>> clear prior art winners are Aurora and Marathon, but in both cases they
>> fall short of what I need.
>>
>> In aurora's case, the software is clearly very early in the open sourcing
>> process and as a result it missing significant pieces. The biggest missing
>> piece is the actual execution framework, Thermos. [That is what I assume
>> thermos does. I have no internal knowledge to verify that assumption]
>> Additionally, Aurora is heavily optimized for a high user count and large
>> number of incoming jobs. My use case is much simpler. There is only one
>> effective user and we have a small known set of jobs which need to run.
>>
>> On the other hand, Marathon is not designed for job execution if job is
>> defined to be a smaller unit of work. Instead, Marathon self-describes as a
>> meta-framework for deploying frameworks to a mesos cluster. A job to
>> marathon is the framework that runs. I do not think Marathon would be a
>> good fit for managing the my task execution and retry logic. It is designed
>> to run at on as a sub-layer of the cluster's resource allocation scheduler
>> and its abstractions follow suit.
>>
>> For my needs Aurora does appear to be a much closer fit than Marathon,
>> but neither is ideal. Since that is the case, I find myself left with a
>> rough choice. I am not thrilled with the prospect of yet another framework
>> for Mesos, but there is a lot of work which I have already completed for my
>> internal project that would need to reworked to fit with Aurora. Currently
>> my project can support the following features.
>>
>> * Distributed job locking - jobs cannot overlap
>> * Job execution delay queue - jobs can be run immediately or after a delay
>> * Job preemption
>> * Job success/failure tracking
>> * Garbage collection of dead jobs
>> * Job execution failover - job is retried on a new executor
>> * Executor warming - min # of executors idle
>> * Executor limits - max # of executors available
>>
>> My plan for integration with mesos is to adapt the job manager into a
>> mesos scheduler and my execution slaves into a mesos executor. At that
>> point, my framework will be able to run on the mesos cluster, but I have a
>> few concerns about how to allocated and release resources that the
>> executors will use over the lifetime of the cluster. I am not sure whether
>> it is better to be greedy early on in the frameworks life-cycle or to
>> decline resources initially and scale the framework's slaves when jobs
>> start coming in. Additionally, the relationship between the executor and
>> its associated driver are not immediately clear to me. If I am reading the
>> code correctly, they do not provide a way to stop a task in progress short
>> of killing the executor process.
>>
>> I think that mesos will be a nice feature to add to my project and I
>> would really appreciate any feedback from the community. I will provide
>> progress updates as I continue work on my experiments.
>>
>
>
>
> --
> Damien HARDY
>

Re: Aurora, Marathon and long lived job frameworks

Posted by Damien Hardy <dh...@viadeoteam.com>.
Hello,

What about Chronos? http://airbnb.github.io/chronos/

Best regards,


2013/9/27 Dan Colish <dc...@urbanairship.com>

> I have been working on an internal project for executing a large number of
> jobs across a cluster for the past couple of months and I am currently
> doing a spike on using mesos for some of the cluster management tasks. The
> clear prior art winners are Aurora and Marathon, but in both cases they
> fall short of what I need.
>
> In aurora's case, the software is clearly very early in the open sourcing
> process and as a result it missing significant pieces. The biggest missing
> piece is the actual execution framework, Thermos. [That is what I assume
> thermos does. I have no internal knowledge to verify that assumption]
> Additionally, Aurora is heavily optimized for a high user count and large
> number of incoming jobs. My use case is much simpler. There is only one
> effective user and we have a small known set of jobs which need to run.
>
> On the other hand, Marathon is not designed for job execution if job is
> defined to be a smaller unit of work. Instead, Marathon self-describes as a
> meta-framework for deploying frameworks to a mesos cluster. A job to
> marathon is the framework that runs. I do not think Marathon would be a
> good fit for managing the my task execution and retry logic. It is designed
> to run at on as a sub-layer of the cluster's resource allocation scheduler
> and its abstractions follow suit.
>
> For my needs Aurora does appear to be a much closer fit than Marathon, but
> neither is ideal. Since that is the case, I find myself left with a rough
> choice. I am not thrilled with the prospect of yet another framework for
> Mesos, but there is a lot of work which I have already completed for my
> internal project that would need to reworked to fit with Aurora. Currently
> my project can support the following features.
>
> * Distributed job locking - jobs cannot overlap
> * Job execution delay queue - jobs can be run immediately or after a delay
> * Job preemption
> * Job success/failure tracking
> * Garbage collection of dead jobs
> * Job execution failover - job is retried on a new executor
> * Executor warming - min # of executors idle
> * Executor limits - max # of executors available
>
> My plan for integration with mesos is to adapt the job manager into a
> mesos scheduler and my execution slaves into a mesos executor. At that
> point, my framework will be able to run on the mesos cluster, but I have a
> few concerns about how to allocated and release resources that the
> executors will use over the lifetime of the cluster. I am not sure whether
> it is better to be greedy early on in the frameworks life-cycle or to
> decline resources initially and scale the framework's slaves when jobs
> start coming in. Additionally, the relationship between the executor and
> its associated driver are not immediately clear to me. If I am reading the
> code correctly, they do not provide a way to stop a task in progress short
> of killing the executor process.
>
> I think that mesos will be a nice feature to add to my project and I would
> really appreciate any feedback from the community. I will provide progress
> updates as I continue work on my experiments.
>



-- 
Damien HARDY