Posted to dev@airavata.apache.org by "Shenoy, Gourav Ganesh" <go...@indiana.edu> on 2016/10/08 02:56:46 UTC

Re: Mesos based meta-scheduling for Airavata

Hi dev,

I have been exploring different frameworks for Mesos which would help our use-case of providing Airavata the capability to run jobs in a Mesos based ecosystem. In particular, I have been playing around with Marathon & Chronos and I am now going to be working on Apache Aurora.

I have summarized my understanding about Mesos, Marathon & Chronos below. I will send out a separate email about Aurora later.

Apache Mesos:


·         Apache Mesos is an open-source cluster manager, in the sense that it helps deploy & manage different frameworks (or applications) in a large clustered environment easily.

·         Mesos provides the ability to utilize the underlying shared pool of nodes as a single compute unit; that is, it can run many applications on these nodes efficiently.

·         Mesos uses the concept of “offers” for scheduling and running jobs on the underlying nodes. When a framework (application) wants to run computations/jobs on the cluster, Mesos will decide how many resources it will “offer” that framework based on the availability. The framework will then decide which resources to use from the offer, and subsequently run the computation/job on that resource.

·         In a typical cluster, you will have 3 or more Mesos masters & multiple Mesos slaves. Multiple Mesos masters provide high availability: if the leading master goes down, a new leader (master) is elected using ZooKeeper.

·         The task mentioned above of providing “offers” to frameworks is done by a master, whereas the slaves are the ones who run these computations.


·         Some additional points:

o    I built a Mesos cluster with 3 masters & 2 slaves on EC2.

o    Each master & slave has 1GB of RAM & 1 vCPU with 20GB of disk space.
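
As a rough illustration of the offer cycle described above, here is a toy model in plain Python. It is not the Mesos API: the Offer and Job classes, the pick_offer function and the agent names are all made up for this sketch, and it only shows the division of labour (the master advertises free resources, the framework picks an offer or declines).

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Resources the master offers from one agent (slave) node."""
    agent: str
    cpus: float
    mem_mb: int


@dataclass
class Job:
    """Resource requirements of a job the framework wants to run."""
    name: str
    cpus: float
    mem_mb: int


def pick_offer(job, offers):
    """Framework side: return the first offer large enough for the job,
    or None to decline every offer and wait for the next round."""
    for offer in offers:
        if offer.cpus >= job.cpus and offer.mem_mb >= job.mem_mb:
            return offer
    return None


# Master side (simulated): advertise what each agent currently has free.
offers = [
    Offer(agent="agent-1", cpus=0.5, mem_mb=256),
    Offer(agent="agent-2", cpus=1.0, mem_mb=1024),
]
job = Job(name="one-off-job", cpus=1.0, mem_mb=512)

chosen = pick_offer(job, offers)
print(chosen.agent if chosen else "no suitable offer; declining")
```

With the values above, only agent-2 has enough CPU and memory, so the framework would launch the job there.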

Marathon:


·         Marathon is considered a framework that runs on top of Mesos. It is a container orchestration platform for Mesos and essentially acts as a service scheduler.

·         It is named “Marathon” because it is intended for long-running applications. That is, Marathon makes sure that the service it is running never stops: if a service goes down, or the slave on which it runs dies, Marathon restarts it on a different slave.

·         In some sense Marathon is very good for ensuring high availability of services. That is, instead of running a service directly on Mesos, run it in Marathon if you never want it to die.
Note: You can choose to run a service on multiple slave nodes; if resources on these slaves are available, Mesos will “offer” them to Marathon.

·         It is called a container orchestration platform because it launches these services inside a container, either a Docker container or a Mesos container.

·         In my opinion it is not a suitable “job scheduler” for Airavata, because in Airavata we need to run a job and get its output rather than keep it running forever. Instead, we can run other schedulers (Chronos/Aurora) as services in Marathon.
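
To make the "service scheduler" idea concrete, here is a minimal sketch of how a long-running service could be described for Marathon's REST API (an app definition posted to /v2/apps). The host name, the app id and the start command are placeholders I made up; the request is only built, not sent.

```python
import json
import urllib.request


def marathon_app(app_id, cmd, instances=2, cpus=0.1, mem=64):
    """Build a minimal Marathon app definition (body of POST /v2/apps)."""
    return {"id": app_id, "cmd": cmd, "instances": instances,
            "cpus": cpus, "mem": mem}


def build_request(marathon_url, app):
    """Prepare (but do not send) the HTTP request that would create the app."""
    return urllib.request.Request(
        url=marathon_url.rstrip("/") + "/v2/apps",
        data=json.dumps(app).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values: the command and the host are placeholders.
app = marathon_app("/chronos", "./bin/start-chronos.sh")
req = build_request("http://marathon.example.org:8080", app)
print(req.get_method(), req.full_url)
# To actually submit the app: urllib.request.urlopen(req)
```

The "instances" field is what tells Marathon how many copies to keep alive; it restarts a copy whenever one exits, which is exactly why it does not suit run-once jobs.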

Chronos:


·         Chronos is a cron scheduler for Mesos. It is good for running scheduled jobs: jobs that need to run a certain number of times, repeating at fixed intervals.

·         Chronos also lets you add dependencies between jobs. That is, if job1 depends on job2, Chronos runs job2 first and runs job1 only after job2 completes. It builds a Directed Acyclic Graph (DAG) from these dependencies.

·         Similar to Marathon, Chronos receives “offers” from the Mesos master whenever it needs to run a job on Mesos.

·         Again, I found that Chronos does not fit the Airavata use-case, since I could not find a way to run one-off jobs via Chronos: you need to specify an interval, and Chronos then re-runs the job after that interval has elapsed (even if you specify num. of repetitions=1).
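
The dependency handling can be sketched as follows. The job dictionaries are shaped like Chronos job definitions (a scheduled job carries an ISO 8601 repeating interval, a dependent job lists its "parents"), but the job names and commands are invented, and the ordering is computed here with Python's standard graphlib module (Python 3.9+) rather than by Chronos itself.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical jobs: "extract" runs on a schedule, the other two depend on it.
jobs = [
    {"name": "extract", "command": "echo extract",
     "schedule": "R/2016-10-08T00:00:00Z/PT24H"},
    {"name": "transform", "command": "echo transform",
     "parents": ["extract"]},
    {"name": "report", "command": "echo report",
     "parents": ["extract", "transform"]},
]


def run_order(jobs):
    """Return job names ordered so that every parent precedes its children.

    Raises graphlib.CycleError if the dependencies do not form a DAG.
    """
    graph = {job["name"]: set(job.get("parents", [])) for job in jobs}
    return list(TopologicalSorter(graph).static_order())


print(run_order(jobs))
```

Here the only valid order is extract, then transform, then report: a job never starts before all of its parents have completed.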


Some additional points:

·         Marathon & Chronos both have REST API support, e.g., you can submit jobs via the APIs, along with other interactions such as listing jobs.

·         I installed the Marathon & Chronos frameworks on the Mesos master nodes. This is what their health looks like on the Mesos dashboard:

[Image: Mesos dashboard listing the Marathon and Chronos frameworks (image002.png)]
As you can see, there are 3 active tasks running in Chronos & 4 active (long-running) tasks in Marathon.


·         I also installed Chronos as a service inside Marathon, and this is what it looks like in the Marathon UI:

[Image: Chronos running as a service in the Marathon UI (image004.png)]
Interestingly, Chronos (as a service in Marathon) was smart enough to identify the jobs submitted via Chronos (as a framework on Mesos) & vice versa.


·         Also, the Mesos dashboard lists the active tasks it is running & which slave each task is running on. It also lists completed tasks. The “Sandbox” gives you access to the stdout/stderr files for a task, as well as any other directories that were created as part of the task.

[Image: Mesos dashboard task list (image005.png)]

Pardon me for this long email. Next, I will explore Apache Aurora, which seems a better fit for the Airavata use-case because it provides the features that Chronos supports and can also run one-off jobs.

Thanks and Regards,
Gourav Shenoy

From: "Shenoy, Gourav Ganesh" <go...@indiana.edu>
Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Date: Friday, September 23, 2016 at 4:43 PM
To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Subject: Mesos based meta-scheduling for Airavata

Hi Dev,

I am working on building a Mesos-based meta-scheduler for Airavata, along with Shameera & Mangirish. Here is the JIRA link: https://issues.apache.org/jira/browse/AIRAVATA-2082.


·         We have identified the tasks needed to achieve this. At a high level they are:

1.      Resource provisioning: We need to provision resources on cloud & HPC infrastructures such as EC2, Jetstream, Comet, etc.

2.      Building a cluster: Deploying a Mesos cluster on the set of nodes obtained from (1) above for task management.

3.      Selecting a scheduler: We need to investigate which scheduler to use with the Mesos cluster. Some of the options are Marathon and Aurora, but we need to find one that suits our need to run serial as well as parallel (MPI) jobs.

4.      Installing & running applications on this cluster: Once the cluster has been deployed and a scheduler chosen, we need to be able to install and run applications on this cluster using Airavata.


·         Until now we were able to look into the following:

o   Resource provisioning:

§  We explored several options for provisioning resources: using cloud libraries as well as Ansible scripts.

§  We built an OpenStack4J Java module which provisions instances on OpenStack-based clouds (e.g., Jetstream).

§  We also built a CloudBridge Python module for provisioning EC2 instances on Amazon. CloudBridge can also be used to provision instances on OpenStack.

§  We wrote Ansible scripts for bringing up instances on both AWS and OpenStack-based clouds.


§  Key Points: CloudBridge and OpenStack4J are powerful libraries for resource provisioning, but currently they only do single-instance provisioning and do not support templated boot options such as CloudFormation (for AWS) & Heat (for OpenStack).


o   Building a cluster:

§  We wrote an Ansible script for deploying a Mesos-Marathon cluster on a set of nodes. This script also installs necessary dependencies such as ZooKeeper.

§  We tested this on OpenStack-based clouds & on EC2.

§  OpenStack Magnum provides excellent support for resource provisioning & deploying a Mesos cluster, but we are running into some problems while trying it.


o   Installing a scheduler:

§  Our Ansible script currently installs Marathon as the scheduler on Mesos. We haven’t yet submitted jobs using Marathon.


·         Although this is not finalized, we are inclined towards the Ansible approach for the above, as Ansible also provides Python APIs, which will allow us to integrate it with Airavata via Thrift. Hence we will be able to invoke the Ansible scripts from code without needing the command-line interface.


·         We are also progressively working on some work-items such as:

o   Exploring options to provision and deploy a Mesos-Marathon cluster on HPC systems such as Comet. The challenge would be to use Ansible to provision resources and deploy the cluster. Once we have a cluster, we can try running applications.

o   Exploring different scheduler options for running serial and parallel (MPI) jobs on such heterogeneous clusters.

o   Exploring orchestration options such as OpenStack Heat, AWS CloudFormation, OpenStack Magnum, etc.
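
For the cluster-building step above, one small piece that can be generated from code is the input the Ansible script needs once the nodes are provisioned. This is only a sketch under assumptions: the group names mesos_masters and mesos_agents and the IP addresses are made up, port 2181 is ZooKeeper's default client port, and /mesos is just the commonly used znode path. It renders an INI-style inventory, the zk:// URL the Mesos masters would share, and the majority quorum size for the masters.

```python
def mesos_inventory(masters, agents):
    """Render an INI-style Ansible inventory with one group per node role.
    The group names are hypothetical; use whatever the playbook expects."""
    lines = ["[mesos_masters]", *masters, "", "[mesos_agents]", *agents, ""]
    return "\n".join(lines)


def zk_url(masters, port=2181, path="/mesos"):
    """ZooKeeper URL listing every master, e.g. zk://a:2181,b:2181/mesos."""
    hosts = ",".join("{}:{}".format(host, port) for host in masters)
    return "zk://{}{}".format(hosts, path)


def quorum(num_masters):
    """Smallest majority of the masters (2 of 3, 3 of 5, ...)."""
    return num_masters // 2 + 1


# Placeholder addresses for a 3-master, 2-slave cluster.
masters = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
agents = ["10.0.0.11", "10.0.0.12"]

print(mesos_inventory(masters, agents))
print(zk_url(masters))
print("quorum:", quorum(len(masters)))
```

Generating these values from the provisioning output keeps the provisioning and deployment steps connected without hand-editing files.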

Any suggestions and comments are highly appreciated.

Thanks and Regards,
Gourav Shenoy




Re: Mesos based meta-scheduling for Airavata

Posted by Renan DelValle <rd...@binghamton.edu>.
I haven't gotten to do that unfortunately. It's on my to-do list for my own
client.

Either way, I think you might get better info if you ask on one of the
Aurora mailing lists.

-Renan

On Thu, Oct 27, 2016 at 5:36 PM, Shenoy, Gourav Ganesh <goshenoy@indiana.edu> wrote:

> *@Renan*,
>
>
>
> I had a question: what is the default Thrift port of the Aurora scheduler
> for TBinaryProtocol?
>
>
>
> I have installed the Aurora-0.16 scheduler/executor on the Mesos-1.0.1
> cluster, and have only been able to use THttpClient over TJSONProtocol (port
> 8081). The Aurora site mentions that TBinaryProtocol is enabled in version
> 0.16, but somehow I am not able to find the binary port. It would be
> great if you could provide some guidance here.
>
>
>
> Thanks and Regards,
>
> Gourav Shenoy
>
>
>

Re: Mesos based meta-scheduling for Airavata

Posted by "Shenoy, Gourav Ganesh" <go...@indiana.edu>.
@Renan,

I had a question: what is the default Thrift port of the Aurora scheduler for TBinaryProtocol?

I have installed the Aurora-0.16 scheduler/executor on the Mesos-1.0.1 cluster, and have only been able to use THttpClient over TJSONProtocol (port 8081). The Aurora site mentions that TBinaryProtocol is enabled in version 0.16, but somehow I am not able to find the binary port. It would be great if you could provide some guidance here.
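
For reference, the JSON-over-HTTP setup that does work for me looks roughly like the sketch below. It assumes the Apache Thrift Python package is installed and that the scheduler serves its Thrift API under the /api path (that path and the host name are assumptions, not something stated above); the function names are made up. The Aurora client class generated from the Thrift IDL would then be constructed on top of the returned protocol object.

```python
def scheduler_url(host, port=8081, path="/api"):
    """URL of the scheduler's HTTP Thrift endpoint.
    The "/api" path is an assumption; 8081 is the port mentioned above."""
    return "http://{}:{}{}".format(host, port, path)


def open_json_protocol(host, port=8081):
    """Open a THttpClient transport and wrap it in TJSONProtocol.

    Needs the Apache Thrift Python package; it is imported lazily so the
    URL helper above can be used without it.
    """
    from thrift.transport import THttpClient
    from thrift.protocol import TJSONProtocol

    transport = THttpClient.THttpClient(scheduler_url(host, port))
    transport.open()
    return TJSONProtocol.TJSONProtocol(transport)


# Placeholder host name, for illustration only.
print(scheduler_url("aurora.example.org"))
```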

Thanks and Regards,
Gourav Shenoy

From: Renan DelValle <rd...@binghamton.edu>
Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Date: Thursday, October 27, 2016 at 4:31 PM
To: Suresh Marru <sm...@apache.org>
Cc: Airavata Dev <de...@airavata.apache.org>, Madhusudhan Govindaraju <mg...@binghamton.edu>
Subject: Re: Mesos based meta-scheduling for Airavata


I wish I had the bandwidth to help with this. I'll do my best to answer any pointed questions (if there are any) on the Aurora irc/slack chat.

-Renan

On Oct 17, 2016 11:38 PM, "Suresh Marru" <sm...@apache.org> wrote:
Hi Renan,

Since you did a similar exercise using Go [1], it will be nice to see your feedback and guidance on the discussions Gourav is summarizing below.

Suresh

[1] - http://markmail.org/thread/ymj7yqvvbhrjwv3s

On Oct 17, 2016, at 11:32 PM, Shenoy, Gourav Ganesh <go...@indiana.edu> wrote:

Hi dev,

Now that I have been able to get jobs scheduled via Aurora, I thought I should summarize my understanding. I would also like to briefly draw out the plan which I am working on with respect to using Mesos with Airavata.

Apache Aurora:

•         Aurora, like Marathon & Chronos, is a service scheduler framework for Mesos. It was built for scheduling long-running services & cron jobs on Mesos.
•         The advantage of Aurora (over Marathon & Chronos) is that it works well for one-off jobs too. That is, if I want to run a job and get the output, Aurora is a better fit, since Marathon will never let the job exit (it keeps restarting it on slaves) & Chronos is ONLY for crons.
•         Aurora also allows fine-grained control of the jobs that are submitted through its concepts of jobs, tasks & processes: a job can consist of one or more tasks, and a task can consist of one or more processes.
•         Aurora manages jobs that are made up of tasks; Mesos manages the tasks that consist of processes; Thermos (the Aurora executor) manages the processes.
•         Because of these job abstractions, we can control resource utilization at the task level.
•         Among many other features, a useful one is resource-quota management for users & support for multiple users running jobs.
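
The job/task/process hierarchy can be pictured with the small model below. It is written as plain Python dataclasses purely for illustration and is not Aurora's own configuration DSL; the class fields, job name and command lines are all made up. It shows why resources can be controlled per task: each task carries its own CPU and RAM request, while processes are just the commands run inside it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Process:
    """A single command line; run by the executor (Thermos)."""
    name: str
    cmdline: str


@dataclass
class Task:
    """A group of processes with one resource request; scheduled via Mesos."""
    name: str
    processes: List[Process]
    cpu: float
    ram_mb: int


@dataclass
class Job:
    """A named collection of tasks; managed by Aurora."""
    name: str
    tasks: List[Task] = field(default_factory=list)

    def total_cpu(self):
        return sum(task.cpu for task in self.tasks)

    def total_ram_mb(self):
        return sum(task.ram_mb for task in self.tasks)


# Hypothetical one-off job: stage input data, then run the application.
stage = Process(name="stage_in", cmdline="cp input.dat work/")
run = Process(name="run_app", cmdline="./app work/input.dat")
job = Job(
    name="airavata_demo",
    tasks=[
        Task(name="main", processes=[stage, run], cpu=1.0, ram_mb=512),
        Task(name="post", processes=[Process("archive", "tar czf out.tgz work/")],
             cpu=0.5, ram_mb=128),
    ],
)
print(job.name, job.total_cpu(), job.total_ram_mb())
```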

Current focus:

•         I am currently building a Thrift-based client for Aurora. I have a working implementation, but it supports only a limited set of operations.
•         I will be adding support for more operations, keeping them aligned with Airavata's job submission/monitoring requirements.
•         I am currently focusing on Airavata deployment to Mesos on a single cluster (e.g., AWS). The flow would look as follows:
[Image: flow for Airavata deployment to a single Mesos cluster (image001.png)]
•         As you can see, currently there is just a single Mesos cluster. The future focus would be to expand this to multiple clusters.

Subsequent work:
•         Once we are able to test Airavata deployment to a single cluster successfully, we can expand this to a multi-cluster environment.
•         Here we would have multiple Mesos clusters, which would somehow need to be managed. The overall flow would look as follows:
[Image: flow for Airavata deployment to multiple Mesos clusters (image002.png)]

•         We can either have multiple Mesos masters (one for each individual cluster) that are connected to each other via VPN, or have a single master, in which case we would need to treat all other nodes as slaves.
•         This is a design issue which needs discussion, and Suresh has some ideas on how to do this.

Thanks and Regards,
Gourav Shenoy

From: Suresh Marru <sm...@apache.org>
Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Date: Friday, October 7, 2016 at 11:43 PM
To: Airavata Dev <de...@airavata.apache.org>
Subject: Re: Mesos based meta-scheduling for Airavata

Hi Gourav,

Thank you for the nice, informative summaries; posts like these are always educational. Keep ’em coming.

Suresh

On Oct 7, 2016, at 10:56 PM, Shenoy, Gourav Ganesh <go...@indiana.edu>> wrote:

Hi dev,

I have been exploring different frameworks for Mesos which would help our use-case of providing Airavata the capability to run jobs in a Mesos based ecosystem. In particular, I have been playing around with Marathon & Chronos and I am now going to be working on Apache Aurora.

I have summarized my understanding about Mesos, Marathon & Chronos below. I will send out a separate email about Aurora later.

Apache Mesos:

•         Apache Mesos is an open-source cluster manager, in the sense that it helps deploy & manage different frameworks (or applications) in a large clustered environment easily.
•         Mesos provides the ability to utilize underlying shared pool of nodes as a single compute unit – That is, it can run many applications on these nodes efficiently.
•         Mesos uses the concept of “offers” for scheduling and running jobs on the underlying nodes. When a framework (application) wants to run computations/jobs on the cluster, Mesos will decide how many resources it will “offer” that framework based on the availability. The framework will then decide which resources to use from the offer, and subsequently run the computation/job on that resource.
•         In a typical cluster, you will have 3 or more Mesos masters & multiple Mesos slaves. Multiple mesos masters help in providing high availability – if one master goes down, Mesos will reelect a new leader (master) – using Zookeeper.
•         The task mentioned above of providing “offers” to frameworks is done by a master, whereas the slaves are the ones who run these computations.

•         Some additional points:
o    I built a Mesos cluster with 3 masters & 2 slaves on EC2.
o    Each master & slave have 1GB of RAM & 1vCPU with 20GB of disk space.

Marathon:

•         Marathon is considered a framework that runs on top of Mesos. It is a container orchestration platform for Mesos and essentially acts as a service scheduler.
•         It is named “marathon” because it is intended for long running applications. That is, Marathon makes sure that the service it is running never stops – if a service goes down or the slave on which the service is run dies, marathon keeps re-starting it on different slaves.
•         In some sense Marathon is very good for ensuring high availability of services. That is, instead of running services directly on Mesos, run it in Marathon if you never want it to die.
Note: You can decide to run a service on multiple slave nodes and if resources on these slaves are available, Mesos will “offer” them to Marathon.
•         It is called a container orchestration platform because it “launches” these services inside a container – either Docker OR Mesos container.
•         In my opinion it is not a suitable “job scheduler” for Airavata because in Airavata we need to run a job and get the output rather than keeping it running always. Instead, we can run other schedulers – chronos/aurora as a service in Marathon.

Chronos:

•         Chronos is a Cron scheduler for Mesos. It is good for running scheduled jobs – jobs that need to be run for a certain number of times, repeatedly after certain intervals.
•         Chronos also provides the ability to add dependencies between jobs – That is, if a job1 is dependent on another job2 then it will run job1 first and then run job2 after job1 completes. It also builds a Directed Acyclic Graph (DAG) based on these dependencies.
•         Similar to Marathon, Chronos receives “offers” from Mesos master whenever it needs to run a job on Mesos.
•         Again, I found that Chronos does not fit the Airavata use-case since I could not find a way to run one-off jobs via Chronos – you need to specify interval time for Chronos, & Chronos then re-runs the job after that interval is complete (even if you decide to specify num. of repetitions=1).


Some additional points:
•         Marathon & Chronos both have REST API support – e.g. you can submit jobs via the APIs, along with other interactions such as listing jobs.
•         I installed the Marathon & Chronos frameworks on the Mesos master nodes. This is what their health looks like on the Mesos dashboard:

<image002.png>
                As you can see, there are 3 active tasks running in Chronos & 4 active tasks (long running) in Marathon.

•         I also installed Chronos as a service inside Marathon, and this is what it looks like in the Marathon UI:

<image004.png>
Interestingly, Chronos (as a service in Marathon) was smart enough to identify the jobs submitted via Chronos (as a framework on Mesos) & vice-versa.

•         Also, the Mesos dashboard lists the active tasks it is running & details about which slave each task is running on. It also lists completed tasks. The “Sandbox” gives you access to the stdout/stderr files for the tasks as well as any other directories that were created as part of the task.

<image005.png>

Pardon me for this long email. Next, I will explore Apache Aurora, which seems a better fit for the Airavata use-case because it provides the features that Chronos supports and can also run one-off jobs.

Thanks and Regards,
Gourav Shenoy

From: "Shenoy, Gourav Ganesh" <go...@indiana.edu>
Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Date: Friday, September 23, 2016 at 4:43 PM
To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Subject: Mesos based meta-scheduling for Airavata

Hi Dev,

I am working on this project of building a Mesos based meta-scheduler for Airavata, along with Shameera & Mangirish. Here is the JIRA link: https://issues.apache.org/jira/browse/AIRAVATA-2082.

•         We have identified some tasks that would be needed for achieving this. At a high level they are:
1.      Resource provisioning – We need to provision resources on cloud & HPC infrastructures such as EC2, Jetstream, Comet, etc.
2.      Building a cluster – Deploying a Mesos cluster on the set of nodes obtained from (1) above for task management.
3.      Selecting a scheduler – We need to investigate which scheduler to use with the Mesos cluster. Some of the options are Marathon and Aurora, but we need to find one that suits our needs of running serial as well as parallel (MPI) jobs.
4.      Installing & running applications on this cluster – Once the cluster has been deployed and a scheduler choice made, we need to be able to install and run applications on this cluster using Airavata.

•         Until now we were able to look into the following:
o   Resource provisioning:
•  We explored several options for provisioning resources – using cloud libraries as well as via Ansible scripts.
•  We built an OpenStack4J Java module which would provision instances on OpenStack based clouds (e.g. Jetstream).
•  We also built a CloudBridge Python module for provisioning EC2 instances on Amazon. CloudBridge can also be used to provision instances on OpenStack.
•  We wrote Ansible scripts for bringing up instances on both AWS and OpenStack based clouds.

•  Key Points: CloudBridge and OpenStack4J are powerful libraries for resource provisioning, but currently they do single-instance provisioning and do not support templated boot options such as CloudFormation (for AWS) & Heat (for OpenStack).

o   Building a cluster:
•  We wrote an Ansible script for deploying a Mesos-Marathon cluster on a set of nodes. This script will install necessary dependencies such as Zookeeper.
•  We tested this on OpenStack based clouds & on EC2.
•  OpenStack Magnum provides excellent support for resource provisioning & deploying a Mesos cluster, but we are running into some problems while trying it.

o   Installing a scheduler:
•  Our Ansible script is currently installing Marathon as the scheduler on Mesos. We haven’t yet submitted jobs using Marathon.

•         Although this is not finalized, we are inclined towards using the Ansible approach for the above, as Ansible also provides Python APIs, which will allow us to integrate it with Airavata via Thrift. Hence we will be able to easily invoke the Ansible scripts from code without needing to use the command-line interface.

•         We are also progressively working on some work-items such as:
o   Exploring options to provision and deploy a Mesos-Marathon cluster on HPC systems such as Comet. The challenge would be to use Ansible to provision resources and deploy the cluster. Once we have a cluster, we can try running applications.
o   Exploring different scheduler options for running serial and parallel (MPI) jobs on such heterogeneous clusters.
o   Exploring orchestration options such as OpenStack Heat, AWS CloudFormation, OpenStack Magnum, etc.

Any suggestions and comments are highly appreciated.

Thanks and Regards,
Gourav Shenoy



Re: Mesos based meta-scheduling for Airavata

Posted by Renan DelValle <rd...@binghamton.edu>.
I wish I had the bandwidth to help with this. I'll do my best to answer any
pointed questions (if there are any) on the Aurora irc/slack chat.

-Renan

On Oct 17, 2016 11:38 PM, "Suresh Marru" <sm...@apache.org> wrote:

> Hi Renan,
>
> Since you did a similar exercise using Go [1], it will be nice to see your
> feedback and guidance on the discussions Gourav is summarizing below.
>
> Suresh
>
> [1] - http://markmail.org/thread/ymj7yqvvbhrjwv3s
>
> On Oct 17, 2016, at 11:32 PM, Shenoy, Gourav Ganesh <go...@indiana.edu>
> wrote:
>
> Hi dev,
>
> Now that I have been able to get jobs scheduled via Aurora, I thought I
> should summarize my understanding. I would also like to briefly draw out
> the plan which I am working on with respect to using Mesos with Airavata.
>
> *Apache Aurora:*
>
> ·         Aurora, similar to Marathon & Chronos, is a service scheduler
> framework for Mesos. It has been built for scheduling long running services
> & cron jobs on Mesos.
> ·         The advantage with Aurora (over Marathon & Chronos) is that it
> works well for one-off jobs as well – i.e. If I want to run a job and get
> the output, Aurora is a better fit than Marathon & Chronos, since Marathon
> will never let the job exit (and keep restarting it on slaves) & Chronos is
> ONLY for crons.
> ·         Aurora also allows fine grained control of the jobs that need
> to be submitted – the concept of jobs, tasks, processes – a job can consist
> of one or more tasks, and a task can consist of one or more processes.
> ·         Aurora manages jobs that are made up of tasks; Mesos manages
> the tasks that consist of processes; Thermos (the Aurora executor)
> manages the processes.
> ·         We can control resource utilization at task level because of
> the above job abstractions that Aurora provides.
> ·         Among many other features, a useful one is the resource-quota
> management for users & the ability to support multiple users to run jobs.
>
> *Current focus:*
>
> ·         I am currently working on building a Thrift based client for
> Aurora, and have been successful in implementing one, but with limited
> operations.
> ·         I will be adding support for more operations keeping them
> aligned to Airavata job submission/monitoring requirements.
> ·         I am currently focusing on targeting Airavata deployment to
> Mesos on a single cluster (e.g. AWS). The flow would look as follows:
> <image001.png>
> ·         As you can see, currently there is just a single Mesos cluster.
> The future focus would be to expand this to have multiple clusters.
>
> *Subsequent work:*
> ·         Once we are able to test Airavata deployment to single cluster
> successfully, we can expand this to a multi-cluster environment.
> ·         Here we would have multiple Mesos clusters, which would
> somehow need to be managed. The overall flow would look as follows:
> <image002.png>
>
> ·         We can either have multiple Mesos masters (for each individual
> cluster), that are connected to each other via VPN, or have a single master
> – in which case we would need to consider all other nodes as slaves.
> ·         This is a design issue which needs discussion, and Suresh has
> some ideas on how to do this.
>
> Thanks and Regards,
> Gourav Shenoy
>
> *From: *Suresh Marru <sm...@apache.org>
> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
> *Date: *Friday, October 7, 2016 at 11:43 PM
> *To: *Airavata Dev <de...@airavata.apache.org>
> *Subject: *Re: Mesos based meta-scheduling for Airavata
>
> Hi Gourav,
>
> Thank you for the nice informative summaries, posts like these are always
> educational. Keep’em coming.
>
> Suresh

Re: Mesos based meta-scheduling for Airavata

Posted by Suresh Marru <sm...@apache.org>.
Hi Renan,

Since you did a similar exercise using Go [1], it will be nice to see your feedback and guidance on the discussions Gourav is summarizing below. 

Suresh

[1] - http://markmail.org/thread/ymj7yqvvbhrjwv3s

> On Oct 17, 2016, at 11:32 PM, Shenoy, Gourav Ganesh <go...@indiana.edu> wrote:
> 
> Hi dev,
>  
> Now that I have been able to get jobs scheduled via Aurora, I thought I should summarize my understanding. I would also like to briefly draw out the plan which I am working on with respect to using Mesos with Airavata.
>  
> Apache Aurora:
>  
> ·         Aurora, similar to Marathon & Chronos, is a service scheduler framework for Mesos. It has been built for scheduling long running services & cron jobs on Mesos.
> ·         The advantage with Aurora (over Marathon & Chronos) is that it works well for one-off jobs as well – i.e. If I want to run a job and get the output, Aurora is a better fit than Marathon & Chronos, since Marathon will never let the job exit (and keep restarting it on slaves) & Chronos is ONLY for crons.
> ·         Aurora also allows fine grained control of the jobs that need to be submitted – the concept of jobs, tasks, processes – a job can consist of one or more tasks, and a task can consist of one or more processes.
> ·         Aurora manages jobs that are made up of tasks; Mesos manages the tasks that consist of processes; Thermos (the Aurora executor) manages the processes.
> ·         We can control resource utilization at task level because of the above job abstractions that Aurora provides.
> ·         Among many other features, a useful one is the resource-quota management for users & the ability to support multiple users to run jobs.
>  
> Current focus:
>  
> ·         I am currently working on building a Thrift based client for Aurora, and have been successful in implementing one, but with limited operations.
> ·         I will be adding support for more operations keeping them aligned to Airavata job submission/monitoring requirements.
> ·         I am currently focusing on targeting Airavata deployment to Mesos on a single cluster (e.g. AWS). The flow would look as follows:
> <image001.png>
> ·         As you can see, currently there is just a single Mesos cluster. The future focus would be to expand this to have multiple clusters.
>  
> Subsequent work:
> ·         Once we are able to test Airavata deployment to single cluster successfully, we can expand this to a multi-cluster environment.
> ·         Here we would have multiple Mesos clusters, which would somehow need to be managed. The overall flow would look as follows:
> <image002.png>
>  
> ·         We can either have multiple Mesos masters (for each individual cluster), that are connected to each other via VPN, or have a single master – in which case we would need to consider all other nodes as slaves.
> ·         This is a design issue which needs discussion, and Suresh has some ideas on how to do this.
>  
> Thanks and Regards,
> Gourav Shenoy
>  
> From: Suresh Marru <sm...@apache.org>
> Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
> Date: Friday, October 7, 2016 at 11:43 PM
> To: Airavata Dev <de...@airavata.apache.org>
> Subject: Re: Mesos based meta-scheduling for Airavata
>  
> Hi Gourav, 
>  
> Thank you for the nice informative summaries, posts like these are always educational. Keep’em coming. 
>  
> Suresh


Re: Mesos based meta-scheduling for Airavata

Posted by "Shenoy, Gourav Ganesh" <go...@indiana.edu>.
Hi Mark, Thejaka,

Thank you for bringing up these compelling points in this discussion. I really appreciate it. @Thejaka, I will respond to your questions in a different email (it might take me some time to summarize).

> “the question for me will be, do we have multiple redundant clusters on separate machines with the correct provisioning for my jobs?”
The plan is to have multiple clusters integrated with Airavata, with centralized control over which type of job runs on each cluster. A simple example would be to have one Mesos cluster equipped to run MPI jobs, another for Hadoop, and maybe a third for serial jobs. The target cluster for a job would then be chosen based on the “type” of the job.
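
To make the idea concrete, here is a minimal sketch of routing by job type. The cluster names are hypothetical placeholders and the three job types are just the ones from the example above; nothing here reflects an existing Airavata API.

```python
# Hypothetical cluster names; the job types follow the example in the text.
CLUSTER_FOR_JOB_TYPE = {
    "mpi": "mesos-mpi",
    "hadoop": "mesos-hadoop",
    "serial": "mesos-serial",
}


def target_cluster(job_type):
    """Return the cluster that should run a job of the given type."""
    key = job_type.strip().lower()
    if key not in CLUSTER_FOR_JOB_TYPE:
        raise ValueError(f"no cluster configured for job type {job_type!r}")
    return CLUSTER_FOR_JOB_TYPE[key]


print(target_cluster("MPI"))
```

Keeping the mapping in one place is what gives the centralized control: adding a cluster or a job type changes the table, not the callers.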

> “But I am not sure about the efficiency/effectiveness of using resources in that way”
I am not sure whether performance will be affected, as you can still specify the complete set of resources needed for the job; the scheduler (either Aurora or a custom scheduler framework) will then route the job to a cluster that meets those resource needs. What we are still investigating is whether Aurora allows us to “know” what resources are available and then make a decision accordingly. If not, we might have to think about writing a custom scheduler that does this.
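
If the available resources of each cluster can be obtained, the routing decision itself is simple. The sketch below assumes a hypothetical snapshot of free CPU and memory per cluster (how to get it from Aurora or Mesos is exactly the open question) and picks the first cluster that can cover a job's request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpus: float
    mem_mb: int

    def covers(self, other):
        """True if these resources are enough to satisfy `other`."""
        return self.cpus >= other.cpus and self.mem_mb >= other.mem_mb


def pick_cluster(request, free):
    """Return the name of the first cluster whose free resources cover the
    request, or None if no cluster can run the job right now."""
    for name, available in free.items():
        if available.covers(request):
            return name
    return None


# Hypothetical snapshot of free resources per cluster.
free = {
    "cluster-a": Resources(cpus=1.0, mem_mb=1024),
    "cluster-b": Resources(cpus=8.0, mem_mb=16384),
}
print(pick_cluster(Resources(cpus=4.0, mem_mb=8192), free))
```

A real scheduler would also have to handle the snapshot going stale between the decision and the launch, which this sketch ignores.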

What we have understood is that the Aurora scheduler might not be the only option for scheduling jobs, as there are other job types that need different ways of managing resources. For example, the Hadoop framework for Mesos has been written to run big-data jobs, and we might not be able to use Aurora to run those. I am not sure if I have answered your questions correctly, as many of these are excellent points which still need some investigation, and I really appreciate you asking them in this thread.

Thanks and Regards,
Gourav Shenoy

From: "Miller, Mark" <mm...@sdsc.edu>
Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Date: Friday, October 28, 2016 at 12:48 PM
To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Subject: RE: Mesos based meta-scheduling for Airavata

Hi Gourav,
As Thejaka suggested, I do still have questions, but I really appreciate the clear summary, which will help me focus my questions better as it helps me understand more.
I think I understand the direction, and when you reach your goal of having multiple clusters, the question for me will be, do we have multiple redundant clusters on separate machines with the correct provisioning for my jobs?
And I realize I have some control, for I can configure my jobs for a minimal set of provisions that all the available machines have. But I am not sure about the efficiency/effectiveness of using resources in that way, and I am not sure it gives me full advantage of the best of my available resources. This is not exactly a coding decision, but is a more high level policy or philosophy question. What is the performance cost of making it easy to move from machine to machine in this environment?
And how does it compare to the cost of the annoying/error prone task of moving between resources in our current CIPRES implementation?  Is there a way to reconfigure jobs on the fly so, when we know where they are going, we can adjust the resources requested in the configuration for that job so we get the most out of each machine? It becomes something of an AI or at least smart system question (in my understanding of those terms). And then there is the question of what we gain or lose by mapping a given job to a given resource. I understand answering these questions is not your mandate, and I don’t want to distract from the nice progress you are making, just tossing out the concerns I am thinking about overall for CIPRES.

Mark

From: Amila Jayasekara [mailto:thejaka.amila@gmail.com]
Sent: Friday, October 28, 2016 9:30 AM
To: dev <de...@airavata.apache.org>
Subject: Re: Mesos based meta-scheduling for Airavata

Hi Gourav,

These are excellent descriptions, but it would be useful if you can lay out your findings according to Mark's questions from the other thread. As per my understanding Mark's question is still not answered (I hope Mark will agree with me).

Also, I am confused about the terminologies used in these tools.
For example, what is the difference between Aurora tasks and Mesos tasks, and what is the difference between Thermos and a Mesos task? In fact, what is the definition of a task in this context? I assume "process" has the standard definition from OS books (a running program).

In MPI we use "aprun -n 32 -N 16 ./a.out" (suppose we request 2 nodes). Here the whole command ("aprun -n 32 -N 16 ./a.out") is the job, "-n" specifies the number of tasks, and "-N" specifies the number of tasks per node (two nodes comprise (16 * 2 =) 32 tasks). So how do these (MPI) tasks relate to Aurora/Mesos tasks?

Further, the "-N" parameter depends on the type of nodes we are using and the number of cores a node has. For example, if the resource has 16 cores per node we will use the above command to run 32 tasks, but if a node has eight cores, then we will use a command like "aprun -n 32 -N 8 ./a.out" (in this case we have to request four nodes). So given a command like "aprun -n 32 ./a.out", is Mesos/Aurora capable of adding the "-N" parameter to the command based on the cluster and types of nodes?

Thanks
-Thejaka



On Mon, Oct 17, 2016 at 11:32 PM, Shenoy, Gourav Ganesh <go...@indiana.edu> wrote:
Hi dev,

Now that I have been able to get jobs scheduled via Aurora, I thought I should summarize my understanding. I would also like to briefly draw out the plan which I am working on with respect to using Mesos with Airavata.

Apache Aurora:


•         Aurora, similar to Marathon & Chronos, is a service scheduler framework for Mesos. It has been built for scheduling long running services & cron jobs on Mesos.

•         The advantage with Aurora (over Marathon & Chronos) is that it works well for one-off jobs as well – i.e. If I want to run a job and get the output, Aurora is a better fit than Marathon & Chronos, since Marathon will never let the job exit (and keep restarting it on slaves) & Chronos is ONLY for crons.

•         Aurora also allows fine grained control of the jobs that need to be submitted – the concept of jobs, tasks, processes – a job can consist of one or more tasks, and a task can consist of one or more processes.

•         Aurora manages jobs that are made up of tasks; Mesos manages the tasks that consist of processes; Thermos (the Aurora executor) manages the processes.

•         We can control resource utilization at task level because of the above job abstractions that Aurora provides.

•         Among many other features, a useful one is the resource-quota management for users & the ability to support multiple users to run jobs.
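
The job / task / process hierarchy described above can be pictured with a few plain Python data classes. This is only a mental model written for this thread, not the Aurora configuration DSL or its API; all field names are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Process:
    """A single command line, run and supervised by the executor (Thermos)."""
    name: str
    cmdline: str


@dataclass
class Task:
    """A group of processes plus the resources reserved for them."""
    name: str
    processes: List[Process]
    cpu: float
    ram_mb: int
    disk_mb: int


@dataclass
class Job:
    """One or more tasks submitted together under a user (role)."""
    name: str
    role: str
    tasks: List[Task]

    def total_cpu(self):
        """Sum of the CPU reserved by all tasks in this job."""
        return sum(task.cpu for task in self.tasks)
```

Because resources hang off the task, this is the level at which utilization can be controlled, as noted in the list above.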

Current focus:


•         I am currently working on building a Thrift based client for Aurora, and have been successful in implementing one, but with limited operations.

•         I will be adding support for more operations keeping them aligned to Airavata job submission/monitoring requirements.

•         I am currently focusing on targeting Airavata deployment to Mesos on a single cluster (e.g. AWS). The flow would look as follows:

[cid:image001.png@01D2311E.3B50D9E0]

•         As you can see, currently there is just a single Mesos cluster. The future focus would be to expand this to have multiple clusters.

Subsequent work:

•         Once we are able to test Airavata deployment to a single cluster successfully, we can expand this to a multi-cluster environment.

•         Here we would have multiple Mesos clusters, which would somehow need to be managed. The overall flow would look as follows:

[cid:image002.png@01D2311E.3B50D9E0]



•         We can either have multiple Mesos masters (for each individual cluster), that are connected to each other via VPN, or have a single master – in which case we would need to consider all other nodes as slaves.

•         This is a design issue which needs discussion, and Suresh has some ideas on how to do this.

Thanks and Regards,
Gourav Shenoy

From: Suresh Marru <sm...@apache.org>
Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Date: Friday, October 7, 2016 at 11:43 PM
To: Airavata Dev <de...@airavata.apache.org>
Subject: Re: Mesos based meta-scheduling for Airavata

Hi Gourav,

Thank you for the nice, informative summaries; posts like these are always educational. Keep 'em coming.

Suresh

On Oct 7, 2016, at 10:56 PM, Shenoy, Gourav Ganesh <go...@indiana.edu> wrote:

Hi dev,

I have been exploring different frameworks for Mesos which would help our use-case of providing Airavata the capability to run jobs in a Mesos based ecosystem. In particular, I have been playing around with Marathon & Chronos and I am now going to be working on Apache Aurora.

I have summarized my understanding about Mesos, Marathon & Chronos below. I will send out a separate email about Aurora later.

Apache Mesos:

•         Apache Mesos is an open-source cluster manager, in the sense that it helps deploy & manage different frameworks (or applications) in a large clustered environment easily.
•         Mesos provides the ability to utilize the underlying shared pool of nodes as a single compute unit – that is, it can run many applications on these nodes efficiently.
•         Mesos uses the concept of “offers” for scheduling and running jobs on the underlying nodes. When a framework (application) wants to run computations/jobs on the cluster, Mesos will decide how many resources it will “offer” that framework based on availability. The framework will then decide which resources to use from the offer, and subsequently run the computation/job on those resources.
•         In a typical cluster, you will have 3 or more Mesos masters & multiple Mesos slaves. Multiple Mesos masters help in providing high availability – if one master goes down, a new leader (master) is elected using ZooKeeper.
•         The task mentioned above of providing “offers” to frameworks is done by a master, whereas the slaves are the ones that run these computations.

•         Some additional points:
o    I built a Mesos cluster with 3 masters & 2 slaves on EC2.
o    Each master & slave has 1GB of RAM & 1vCPU with 20GB of disk space.
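
The offer cycle described above can be sketched from the framework's point of view. This is a simplified model in Python, not the Mesos scheduler API; the Offer type and handle_offers function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Resources that the master offers from one slave node."""
    slave: str
    cpus: float
    mem_mb: int


def handle_offers(offers, need_cpus, need_mem_mb):
    """Accept the first offer that fits the job and decline all the others."""
    accepted = None
    declined = []
    for offer in offers:
        fits = offer.cpus >= need_cpus and offer.mem_mb >= need_mem_mb
        if accepted is None and fits:
            accepted = offer
        else:
            declined.append(offer)
    return accepted, declined
```

On a cluster like the one above (two slaves with 1 vCPU and 1GB each), a job asking for 0.5 CPU and 256MB would be placed on whichever slave's offer arrives first, and the other offer goes back to the master.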

Marathon:

•         Marathon is considered a framework that runs on top of Mesos. It is a container orchestration platform for Mesos and essentially acts as a service scheduler.
•         It is named “marathon” because it is intended for long running applications. That is, Marathon makes sure that the service it is running never stops – if a service goes down or the slave on which the service is running dies, Marathon keeps restarting it on different slaves.
•         In some sense Marathon is very good for ensuring high availability of services. That is, instead of running a service directly on Mesos, run it in Marathon if you never want it to die.
Note: You can decide to run a service on multiple slave nodes and if resources on these slaves are available, Mesos will “offer” them to Marathon.
•         It is called a container orchestration platform because it “launches” these services inside a container – either Docker OR Mesos container.
•         In my opinion it is not a suitable “job scheduler” for Airavata, because in Airavata we need to run a job and get its output rather than keep it running forever. Instead, we can run other schedulers (Chronos/Aurora) as a service in Marathon.
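
For reference, a long-running service is described to Marathon as a small JSON app definition that is submitted to its REST API (POST to /v2/apps). The Python sketch below only builds that JSON body; the app id, command and resource values are placeholders, and nothing is actually sent.

```python
import json


def marathon_app(app_id, cmd, cpus=0.1, mem_mb=32, instances=1):
    """Build the JSON-serializable body of a Marathon app definition."""
    return {
        "id": app_id,            # unique path-like name of the app
        "cmd": cmd,              # command that Marathon keeps running
        "cpus": cpus,            # CPU share requested per instance
        "mem": mem_mb,           # memory in MB requested per instance
        "instances": instances,  # number of copies to keep alive
    }


# Placeholder service: two copies of a command that just sleeps.
body = json.dumps(marathon_app("/sleeper", "sleep 3600", instances=2))
```

Note that "instances" is a target Marathon will keep enforcing, which is exactly why a run-once job does not fit this model.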


Chronos:

•         Chronos is a Cron scheduler for Mesos. It is good for running scheduled jobs – jobs that need to be run for a certain number of times, repeatedly after certain intervals.
•         Chronos also provides the ability to add dependencies between jobs – that is, if job1 is dependent on job2, then it will run job2 first and then run job1 after job2 completes. It also builds a Directed Acyclic Graph (DAG) based on these dependencies.
•         Similar to Marathon, Chronos receives “offers” from Mesos master whenever it needs to run a job on Mesos.
•         Again, I found that Chronos does not fit the Airavata use-case since I could not find a way to run one-off jobs via Chronos – you need to specify interval time for Chronos, & Chronos then re-runs the job after that interval is complete (even if you decide to specify num. of repetitions=1).
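
The dependency ordering can be illustrated with a topological sort: a job runs only after every job it depends on has completed. The sketch below uses Python's standard graphlib module (Python 3.9+) and made-up job names; it shows the ordering rule only and does not talk to Chronos.

```python
from graphlib import TopologicalSorter

# Each key is a job; the value is the set of jobs it depends on (its parents).
dependencies = {
    "job1": {"job2"},  # job1 depends on job2, so job2 must run first
    "job3": {"job1"},  # job3 depends on job1
}


def run_order(deps):
    """Return the jobs in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


print(run_order(dependencies))  # prints ['job2', 'job1', 'job3']
```

A cycle in the dependencies would make TopologicalSorter raise graphlib.CycleError, which is why the graph has to be acyclic.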


Some additional points:
•         Marathon & Chronos both have REST API support – eg: you can submit jobs via APIs along with other interactions such as list jobs, etc.
•         I installed the Marathon & Chronos frameworks on the Mesos master nodes. This is what their health looks like on the Mesos dashboard:

<image002.png>
                As you can see, there are 3 active tasks running in Chronos & 4 active tasks (long running) in Marathon.

•         I also installed Chronos as a service inside Marathon, and this is what it looks like in the Marathon UI:


<image004.png>
Interestingly, Chronos (as a service in Marathon) was smart enough to identify the jobs submitted via Chronos (as a framework on Mesos) & vice-versa.

•         Also, Mesos dashboard lists the active tasks it is running & details about which slave the task is running on. It also lists Completed tasks. The “Sandbox” gives you access to the stdout/stderr files for the tasks as well as any other directories that were created as part of the task.


<image005.png>

Pardon me for this long email. Next, I will explore Apache Aurora, which seems a better fit for the Airavata use-case because it provides the features that Chronos supports and can also run one-off jobs.

Thanks and Regards,
Gourav Shenoy

From: "Shenoy, Gourav Ganesh" <go...@indiana.edu>
Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Date: Friday, September 23, 2016 at 4:43 PM
To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Subject: Mesos based meta-scheduling for Airavata

Hi Dev,

I am working on this project of building a Mesos based meta-scheduler for Airavata, along with Shameera & Mangirish. Here is the JIRA link: https://issues.apache.org/jira/browse/AIRAVATA-2082.

•         We have identified some tasks that would be needed for achieving this, and at the higher level it would consist of:
1.      Resource provisioning – We need to provision resources on cloud & hpc infrastructures such as EC2, Jetstream, Comet, etc.
2.      Building a cluster – Deploying a Mesos cluster on set of nodes obtained from (1) above for task management.
3.      Selecting a scheduler – We need to investigate the scheduler to use with Mesos cluster. Some of the options are Marathon, Aurora. But we need to find one that suits our needs of running serial as well as parallel (MPI) jobs.
4.      Installing & running applications on this cluster – Once the cluster has been deployed and a scheduler choice made, we need to be able to install and run applications on this cluster using Airavata.

•         Until now we were able to look into the following:
o   Resource provisioning:
•  We explored several options for provisioning resources – using cloud libraries as well as via Ansible scripts.
•  We built an OpenStack4J Java module which would provision instances on OpenStack based clouds (e.g. Jetstream).
•  We also built a CloudBridge Python module for provisioning EC2 instances on Amazon. CloudBridge can also be used to provision instances on OpenStack.
•  We wrote Ansible scripts for bringing up instances on both AWS and OpenStack based clouds.

•  Key Points: CloudBridge and OpenStack4J are powerful libraries for resource provisioning, but currently they do single-instance provisioning and do not support templated boot options such as CloudFormation (for AWS) & Heat (for OpenStack).

o   Building a cluster:
•  We wrote an Ansible script for deploying a Mesos-Marathon cluster on a set of nodes. This script will install necessary dependencies such as ZooKeeper.
•  We tested this on OpenStack based clouds & on EC2.
•  OpenStack Magnum provides excellent support for doing resource provisioning & deploying a Mesos cluster, but we are running into some problems while trying it.

o   Installing a scheduler:
•  Our Ansible script is currently installing Marathon as the scheduler on Mesos. We haven’t yet submitted jobs using Marathon.

•         Although this is not finalized, we are inclined towards the Ansible approach for the above, as Ansible also provides Python APIs, which will allow us to integrate it with Airavata via Thrift. Hence we will be able to easily invoke the Ansible scripts from code without needing to use the command-line interface.

•         We are also progressively working on some work-items such as:
o   Exploring options to provision and deploy a Mesos-Marathon cluster on HPC systems such as Comet. The challenge would be to use Ansible to provision resources and deploy the cluster. Once we have a cluster, we can try running applications.
o   Exploring different scheduler options for running serial and parallel (MPI) jobs on such heterogeneous clusters.
o   Exploring orchestration options such as OpenStack Heat, AWS CloudFormation, OpenStack Magnum, etc.

Any suggestions and comments are highly appreciated.

Thanks and Regards,
Gourav Shenoy



RE: Mesos based meta-scheduling for Airavata

Posted by "Miller, Mark" <mm...@sdsc.edu>.
Hi Gourav,
As Thejaka suggested, I do still have questions, but I really appreciate the clear summary, which will help me focus my questions better as it helps me understand more.
I think I understand the direction, and when you reach your goal of having multiple clusters, the question for me will be, do we have multiple redundant clusters on separate machines with the correct provisioning for my jobs?
And I realize I have some control, for I can configure my jobs for a minimal set of provisions that all the available machines have. But I am not sure about the efficiency/effectiveness of using resources in that way, and I am not sure it gives me full advantage of the best of my available resources. This is not exactly a coding decision, but is a more high level policy or philosophy question. What is the performance cost of making it easy to move from machine to machine in this environment?
And how does it compare to the cost of the annoying/error prone task of moving between resources in our current CIPRES implementation?  Is there a way to reconfigure jobs on the fly so, when we know where they are going, we can adjust the resources requested in the configuration for that job so we get the most out of each machine? It becomes something of an AI or at least smart system question (in my understanding of those terms). And then there is the question of what we gain or lose by mapping a given job to a given resource. I understand answering these questions is not your mandate, and I don’t want to distract from the nice progress you are making, just tossing out the concerns I am thinking about overall for CIPRES.

Mark




Re: Mesos based meta-scheduling for Airavata

Posted by Amila Jayasekara <th...@gmail.com>.
Hi Gourav,

These are excellent descriptions, but it would be useful if you can lay out
your findings according to Mark's questions from the other thread. As per
my understanding Mark's question is still not answered (I hope Mark will
agree with me).

Also, I am confused about the terminologies used in these tools.
For example, what is the difference between Aurora tasks and Mesos tasks, and
what is the difference between Thermos and a Mesos task? In fact, what is the
definition of a task in this context? I assume "process" has the standard
definition from OS books (a running program).

In MPI we use "aprun -n 32 -N 16 ./a.out" (suppose we request 2 nodes).
Here the whole command ("aprun -n 32 -N 16 ./a.out") is the job, "-n"
specifies the number of tasks, and "-N" specifies the number of tasks per
node (two nodes comprise (16 * 2 =) 32 tasks). So how do these (MPI) tasks
relate to Aurora/Mesos tasks?

Further, the "-N" parameter depends on the type of nodes we are using and
the number of cores a node has. For example, if the resource has 16 cores
per node we will use the above command to run 32 tasks, but if a node has
eight cores, then we will use a command like  "aprun -n 32 -N 8 ./a.out"
(in this case we have to request four nodes). So given a command like "aprun -n
32 ./a.out", is Mesos/Aurora capable of adding "-N" parameter to the
command based on the cluster and types of nodes ?
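
To make the arithmetic concrete, here is a minimal sketch of the calculation
I have in mind. It assumes one MPI task per core; the function name
aprun_command is purely illustrative and is not part of Mesos, Aurora or
Airavata.

```python
import math


def aprun_command(total_tasks, cores_per_node, executable="./a.out"):
    """Return (nodes_needed, command), assuming one MPI task per core.

    Hypothetical helper for illustration only; not a Mesos/Aurora API.
    """
    if total_tasks <= 0 or cores_per_node <= 0:
        raise ValueError("total_tasks and cores_per_node must be positive")
    # Never place more tasks on a node than the job has in total.
    tasks_per_node = min(total_tasks, cores_per_node)
    # Round up so that a partially filled last node is still requested.
    nodes_needed = math.ceil(total_tasks / tasks_per_node)
    command = f"aprun -n {total_tasks} -N {tasks_per_node} {executable}"
    return nodes_needed, command


if __name__ == "__main__":
    for cores in (16, 8):
        nodes, cmd = aprun_command(32, cores)
        print(f"{cores} cores/node: request {nodes} nodes, run: {cmd}")
```

The question is whether the scheduler can derive the "-N" value from the
nodes it actually gets, or whether the caller still has to know the node
type in advance.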

Thanks
-Thejaka




Re: Mesos based meta-scheduling for Airavata

Posted by "Shenoy, Gourav Ganesh" <go...@indiana.edu>.
Hi dev,

Now that I have been able to get jobs scheduled via Aurora, I thought I should summarize my understanding. I would also like to briefly outline the plan I am working on for using Mesos with Airavata.

Apache Aurora:


·         Aurora, like Marathon & Chronos, is a service scheduler framework for Mesos. It was built for scheduling long-running services & cron jobs on Mesos.

·         The advantage of Aurora over Marathon & Chronos is that it also works well for one-off jobs. If I want to run a job once and get its output, Aurora is a better fit: Marathon will never let the job exit (it keeps restarting it on slaves), and Chronos is ONLY for cron jobs.

·         Aurora also allows fine-grained control over the jobs being submitted through its concepts of jobs, tasks and processes: a job consists of one or more tasks, and a task consists of one or more processes.

·         Aurora manages jobs, which are made up of tasks; Mesos manages tasks, which are made up of processes; Thermos (the Aurora executor) manages the processes.

·         Because of these job abstractions, we can control resource utilization at the task level.

·         Among many other features, a useful one is resource-quota management for users, along with support for multiple users running jobs.
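
The job/task/process hierarchy described above can be sketched as a small data model. This is a plain-Python illustration only, not Aurora's configuration DSL or its Thrift API; the class names, fields and the example job are assumptions made for the sketch. It shows resources being declared at the task level, with processes grouped inside a task.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Process:
    """A single command line; the level that Thermos manages."""
    name: str
    cmdline: str


@dataclass
class Task:
    """A group of processes with one resource request; the level Mesos manages."""
    name: str
    processes: List[Process]
    cpu: float
    ram_mb: int


@dataclass
class Job:
    """One or more tasks submitted together; the level Aurora manages."""
    name: str
    tasks: List[Task] = field(default_factory=list)

    def total_cpu(self) -> float:
        return sum(task.cpu for task in self.tasks)

    def total_ram_mb(self) -> int:
        return sum(task.ram_mb for task in self.tasks)


# Hypothetical one-off job: one task runs the computation, one collects output.
run_task = Task(
    name="run",
    processes=[
        Process("fetch_input", "echo fetching input"),
        Process("compute", "echo running application"),
    ],
    cpu=1.0,
    ram_mb=512,
)
collect_task = Task(
    name="collect",
    processes=[Process("copy_output", "echo copying output")],
    cpu=0.5,
    ram_mb=128,
)
job = Job(name="airavata_experiment", tasks=[run_task, collect_task])

if __name__ == "__main__":
    print(job.name, job.total_cpu(), "CPUs,", job.total_ram_mb(), "MB RAM")
```

Since each task carries its own resource request, a per-user quota check only needs to sum over the tasks of that user's jobs.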

Current focus:


·         I am currently building a Thrift based client for Aurora. I have a working implementation, but it supports only a limited set of operations.

·         I will be adding support for more operations, keeping them aligned with Airavata's job submission/monitoring requirements.

·         I am currently targeting Airavata deployment to Mesos on a single cluster (e.g. AWS). The flow would look as follows:

[cid:image001.png@01D228CE.AF51BFB0]

·         As you can see, there is currently just a single Mesos cluster. The future focus would be to expand this to multiple clusters.

Subsequent work:

·         Once we are able to test Airavata deployment to a single cluster successfully, we can expand this to a multi-cluster environment.

·         Here we would have multiple Mesos clusters, which would somehow need to be managed. The overall flow would look as follows:

[cid:image002.png@01D228CE.AF51BFB0]



·         We can either have multiple Mesos masters (one for each individual cluster) that are connected to each other via VPN, or have a single master, in which case we would need to treat all other nodes as slaves.

·         This is a design issue which needs discussion, and Suresh has some ideas on how to do this.

Thanks and Regards,
Gourav Shenoy



Re: Mesos based meta-scheduling for Airavata

Posted by Suresh Marru <sm...@apache.org>.
Hi Gourav,

Thank you for the nice, informative summaries; posts like these are always educational. Keep ’em coming.

Suresh

> On Oct 7, 2016, at 10:56 PM, Shenoy, Gourav Ganesh <go...@indiana.edu> wrote:
> 
> Hi dev,
>  
> I have been exploring different frameworks for Mesos which would help our use-case of providing Airavata the capability to run jobs in a Mesos based ecosystem. In particular, I have been playing around with Marathon & Chronos and I am now going to be working on Apache Aurora. 
>  
> I have summarized my understanding about Mesos, Marathon & Chronos below. I will send out a separate email about Aurora later.
>  
> Apache Mesos:
>  
> ·         Apache Mesos is an open-source cluster manager, in the sense that it helps deploy & manage different frameworks (or applications) in a large clustered environment easily.
> ·         Mesos provides the ability to utilize underlying shared pool of nodes as a single compute unit – That is, it can run many applications on these nodes efficiently.
> ·         Mesos uses the concept of “offers” for scheduling and running jobs on the underlying nodes. When a framework (application) wants to run computations/jobs on the cluster, Mesos will decide how many resources it will “offer” that framework based on the availability. The framework will then decide which resources to use from the offer, and subsequently run the computation/job on that resource.
> ·         In a typical cluster, you will have 3 or more Mesos masters & multiple Mesos slaves. Multiple mesos masters help in providing high availability – if one master goes down, Mesos will reelect a new leader (master) – using Zookeeper.
> ·         The task mentioned above of providing “offers” to frameworks is done by a master, whereas the slaves are the ones who run these computations.
>  
> ·         Some additional points:
> o    I built a Mesos cluster with 3 masters & 2 slaves on EC2.
> o    Each master & slave have 1GB of RAM & 1vCPU with 20GB of disk space.
>  
> Marathon:
>  
> ·         Marathon is considered a framework that runs on top of Mesos. It is a container orchestration platform for Mesos and essentially acts as a service scheduler.
> ·         It is named “marathon” because it is intended for long running applications. That is, Marathon makes sure that the service it is running never stops – if a service goes down or the slave on which the service is run dies, marathon keeps re-starting it on different slaves. 
> ·         In some sense Marathon is very good for ensuring high availability of services. That is, instead of running services directly on Mesos, run it in Marathon if you never want it to die.
> Note: You can decide to run a service on multiple slave nodes and if resources on these slaves are available, Mesos will “offer” them to Marathon.
> ·         It is called a container orchestration platform because it “launches” these services inside a container – either Docker OR Mesos container.
> ·         In my opinion it is not a suitable “job scheduler” for Airavata because in Airavata we need to run a job and get the output rather than keeping it running always. Instead, we can run other schedulers – chronos/aurora as a service in Marathon.
> 
> Chronos:
>  
> ·         Chronos is a Cron scheduler for Mesos. It is good for running scheduled jobs – jobs that need to be run for a certain number of times, repeatedly after certain intervals.
> ·         Chronos also provides the ability to add dependencies between jobs – That is, if a job1 is dependent on another job2 then it will run job1 first and then run job2 after job1 completes. It also builds a Directed Acyclic Graph (DAG) based on these dependencies.
> ·         Similar to Marathon, Chronos receives “offers” from Mesos master whenever it needs to run a job on Mesos.
> ·         Again, I found that Chronos does not fit the Airavata use-case since I could not find a way to run one-off jobs via Chronos – you need to specify interval time for Chronos, & Chronos then re-runs the job after that interval is complete (even if you decide to specify num. of repetitions=1).
>  
>  
> Some additional points:
> ·         Marathon & Chronos both have REST API support – eg: you can submit jobs via APIs along with other interactions such as list jobs, etc.
> ·         I installed Marathon & Chronos frameworks on the Mesos master nodes. This is how their health looks like on the Mesos dashboard:
>  
> <image002.png>
>                 As you can see, there are 3 active tasks running in Chronos & 4 active tasks (long running) in Marathon.
>  
> ·         I also installed Chronos as a service inside Marathon, and this is how it looks like in the Marathon UI:
> 
> <image004.png>
> Interestingly, Chronos (as a service in Marathon) was smart enough to identify the jobs submitted via Chronos (as a framework on Mesos) & vice-versa.
>  
> ·         Also, Mesos dashboard lists the active tasks it is running & details about which slave the task is running on. It also lists Completed tasks. The “Sandbox” gives you access to the stdout/stderr files for the tasks as well as any other directories that were created as part of the task.
> 
> <image005.png>
>  
> Pardon me for this long email. Next, I will explore Apache Aurora which seems a better fit for Airavata use-case because it provides the features that Chronos supports, as well as can run one-off jobs.
>  
> Thanks and Regards,
> Gourav Shenoy
>  
> From: "Shenoy, Gourav Ganesh" <go...@indiana.edu>
> Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
> Date: Friday, September 23, 2016 at 4:43 PM
> To: "dev@airavata.apache.org" <de...@airavata.apache.org>
> Subject: Mesos based meta-scheduling for Airavata
>  
> Hi Dev,
>  
> I am working on this project of building a Mesos based meta-scheduler for Airavata, along with Shameera & Mangirish. Here is the jira link:https://issues.apache.org/jira/browse/AIRAVATA-2082 <https://issues.apache.org/jira/browse/AIRAVATA-2082>.
>  
> ·         We have identified some tasks that would be needed for achieving this, and at the higher level it would consist of:
> 1.      Resource provisioning – We need to provision resources on cloud & hpc infrastructures such as EC2, Jetstream, Comet, etc.
> 2.      Building a cluster – Deploying a Mesos cluster on set of nodes obtained from (1) above for task management.
> 3.      Selecting a scheduler – We need to investigate the scheduler to use with Mesos cluster. Some of the options are Marathon, Aurora. But we need to find one that suits our needs of running serial as well as parallel (MPI) jobs.
> 4.      Installing & running applications on this cluster – Once the cluster has been deployed and a scheduler choice made, we need to be able to install and run applications on this cluster using Airavata.
>  
> ·         Until now we were able to look into the following:
> o   Resource provisioning:
> §  We explored several options for provisioning resources – using cloud libraries as well as Ansible scripts.
> §  We built an OpenStack4J Java module which provisions instances on OpenStack based clouds (e.g. Jetstream).
> §  We also built a CloudBridge Python module for provisioning EC2 instances on Amazon. CloudBridge can also be used to provision instances on OpenStack.
> §  We wrote Ansible scripts for bringing up instances on both AWS and OpenStack based clouds.
>  
> §  Key Points: CloudBridge and OpenStack4J are powerful libraries for resource provisioning, but currently they only provision a single instance at a time and do not support templated boot options such as CloudFormation (for AWS) & Heat (for OpenStack).
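
To illustrate the key point above: with single-instance provisioning, the caller has to loop over the nodes and clean up a partially built cluster itself, which is what a template engine would otherwise handle. This is only a minimal sketch. The callables provision_one and terminate_one are hypothetical stand-ins for whatever single-instance calls the provisioning library offers; they are not CloudBridge or OpenStack4J APIs.

```python
from typing import Callable, List


def provision_cluster(
    provision_one: Callable[[str], str],
    terminate_one: Callable[[str], None],
    name_prefix: str,
    count: int,
) -> List[str]:
    """Provision `count` instances one at a time and roll back on failure.

    provision_one takes an instance name and returns an instance id;
    terminate_one takes an instance id. Both are hypothetical placeholders
    for the single-instance calls of a provisioning library.
    """
    created: List[str] = []
    try:
        for index in range(count):
            created.append(provision_one(f"{name_prefix}-{index}"))
    except Exception:
        # No template engine is tracking the stack, so the cleanup of a
        # partially built cluster has to happen here, newest instance first.
        for instance_id in reversed(created):
            terminate_one(instance_id)
        raise
    return created
```

With CloudFormation or Heat the same all-or-nothing behaviour would come from the template engine rather than from hand-written rollback code.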
>  
> o   Building a cluster:
> §  We wrote an Ansible script for deploying a Mesos-Marathon cluster on a set of nodes. This script also installs the necessary dependencies such as Zookeeper.
> §  We tested this on OpenStack based clouds & on EC2.
> §  OpenStack Magnum provides excellent support for resource provisioning & deploying a Mesos cluster, but we are running into some problems while trying it.
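
As a rough sketch of how the provisioned nodes could be handed to such an Ansible script, the snippet below renders an INI-style inventory and the zk:// URL that Mesos uses to locate the leading master. The group names mesos_masters and mesos_slaves, the IP addresses, and the assumption that Zookeeper runs on the master nodes on port 2181 are all illustrative and not taken from our actual scripts.

```python
from typing import Dict, List


def render_inventory(groups: Dict[str, List[str]]) -> str:
    """Render an INI-style Ansible inventory from a group -> hosts mapping."""
    lines: List[str] = []
    for group, hosts in groups.items():
        lines.append(f"[{group}]")
        lines.extend(hosts)
        lines.append("")  # blank line after each group
    return "\n".join(lines)


def zookeeper_url(hosts: List[str], port: int = 2181, path: str = "/mesos") -> str:
    """Build a zk://host:port,host:port/path URL for the given Zookeeper hosts."""
    endpoints = ",".join(f"{host}:{port}" for host in hosts)
    return f"zk://{endpoints}{path}"


if __name__ == "__main__":
    # Hypothetical addresses for a 3-master, 2-slave cluster.
    masters = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
    slaves = ["10.0.0.11", "10.0.0.12"]
    print(render_inventory({"mesos_masters": masters, "mesos_slaves": slaves}))
    print(zookeeper_url(masters))
```

The inventory text can be written to a file and passed to the playbook, while the zk:// URL is the value the masters and slaves would be configured with.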
>  
> o   Installing a scheduler:
> §  Our Ansible script currently installs Marathon as the scheduler on Mesos. We haven’t yet submitted jobs using Marathon.
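
For reference, submitting a job to Marathon amounts to POSTing a JSON app definition to its /v2/apps REST endpoint. The sketch below only builds that request and does not send it. The base URL, app id, command, and resource values are placeholder assumptions, since we have not run this against our cluster yet.

```python
import json
import urllib.request


def build_marathon_request(base_url, app_id, cmd, cpus=0.1, mem=32, instances=1):
    """Build (but do not send) a POST request that creates a Marathon app."""
    app = {
        "id": app_id,            # unique app identifier, e.g. "/hello"
        "cmd": cmd,              # shell command Marathon should run
        "cpus": cpus,            # CPU share requested per instance
        "mem": mem,              # memory in MB requested per instance
        "instances": instances,  # number of copies to keep running
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v2/apps",
        data=json.dumps(app).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical Marathon address; replace with the real master host.
    request = build_marathon_request(
        "http://marathon.example:8080", "/hello", "echo hello; sleep 60"
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would perform the actual submission once a Marathon endpoint is reachable.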
>  
> ·         Although nothing is finalized, we are inclined towards the Ansible approach for the above, since Ansible also provides Python APIs, which will allow us to integrate it with Airavata via Thrift. We would then be able to invoke the Ansible scripts from code without needing to use the command-line interface.
>  
> ·         We are also working on the following items:
> o   Exploring options to provision and deploy a Mesos-Marathon cluster on HPC systems such as Comet. The challenge would be to use Ansible to provision resources and deploy the cluster. Once we have a cluster, we can try running applications.
> o   Exploring different scheduler options for running serial and parallel (MPI) jobs on such heterogeneous clusters.
> o   Exploring orchestration options such as OpenStack Heat, AWS CloudFormation, OpenStack Magnum, etc.
>  
> Any suggestions and comments are highly appreciated.
>  
> Thanks and Regards,
> Gourav Shenoy