Posted to dev@airavata.apache.org by "Shenoy, Gourav Ganesh" <go...@indiana.edu> on 2016/10/15 03:04:38 UTC

Apache Aurora Scheduler APIs (Thrift)

Hi dev,

I am working on building a Thrift client for the Apache Aurora scheduler running on a Mesos cluster. The Apache Aurora documentation provides very little information about the Thrift APIs that Aurora exposes. One way to find out which services are exposed is to go through the "api.thrift" file in the Aurora GitHub repository (https://github.com/apache/aurora/blob/master/api/src/main/thrift/org/apache/aurora/gen/api.thrift), but reading through that file to figure out the APIs can be daunting.

I have installed Aurora on a Mesos cluster on EC2 to carry out tests, and the Aurora UI dashboard provides a wide range of useful information. The dashboard includes a "Scheduler API" link, which gives a comprehensive list of all the Thrift services/APIs that the Aurora scheduler exposes. I think this is very useful for anyone who plans to write a client.

I have saved a copy of this HTML page and uploaded it to S3 for reference: https://s3-us-west-2.amazonaws.com/apache-aurora/thrift_module_api.htm

Snapshot: [inline image]

Thanks and Regards,
Gourav Shenoy

Re: Apache Aurora Scheduler APIs (Thrift)

Posted by "Shenoy, Gourav Ganesh" <go...@indiana.edu>.

Hi devs,

I have created a pull request for the Aurora Thrift client (Java) implementation. Here is the link to the PR: https://github.com/apache/airavata/pull/62

Suresh, Shameera: Kindly review the PR and let me know if there are any changes needed.

Thanks and Regards,
Gourav Shenoy

Re: Apache Aurora Scheduler APIs (Thrift)

Posted by "Shenoy, Gourav Ganesh" <go...@indiana.edu>.

Hi Mangirish,

I have already shared the details of this setup with you. This cluster is the same one I used to test Mesos-Marathon-Chronos. Let me know if you face any problems or have any questions.

I have set up an Aurora client machine for running command-line tests, and you can also install aurora-cli on your local machine if needed. I will share the client machine details with you on HipChat.

Thanks and Regards,
Gourav Shenoy

Re: Apache Aurora Scheduler APIs (Thrift)

Posted by Mangirish Wagle <va...@gmail.com>.

Hi Gourav,

This is great. Can I get access to this setup to see whether I can submit a
1-node MPI job through it?
I would also like to try submitting jobs in batch, which could potentially
be used for "gang scheduling" to run MPI jobs.

Thanks and Regards,
Mangirish

Re: Apache Aurora Scheduler APIs (Thrift)

Posted by "Shenoy, Gourav Ganesh" <go...@indiana.edu>.

Hi dev,

I was able to build a “test” Thrift client for the Apache Aurora scheduler running on the Mesos cluster I deployed on EC2. I call it a “test” client because it is not yet complete and currently performs only the following operations:

1.       Submit a one-off job to the Aurora scheduler.

2.       Monitor the status of the submitted job. The Thrift APIs also let us check whether any jobs are pending and why a job is in the PENDING state. This tells us when resources (e.g. CPUs) are insufficient, so that we can provision new ones if needed.

3.       Retrieve the list of running jobs.

Some details:


·         About the Thrift client

o    I cloned the Apache Aurora repository, which contains the "api.thrift" file that defines the RPC structures we need for the client.

o    I generated client stubs from this "api.thrift" file using the "thrift-maven plugin". The plugin directly creates a JAR with all the Thrift-generated classes, which can be used as a library/dependency in our client project.

o    I initially tried connecting to the scheduler via a "TSocket" transport and spent a lot of time figuring out why this failed. Apparently, the current installation of Aurora only exposes an HTTP endpoint (on port 8081).

o    I therefore had to use "THttpClient" (instead of TSocket) with TJSONProtocol (instead of TBinaryProtocol). I will email the Aurora mailing list to find out how to enable a binary socket connection.
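
To make the HTTP plus JSON protocol pairing a little more concrete, here is a minimal, dependency-free sketch of what a single call might look like on the wire. It is not the client code described in this thread, which relies on the Thrift-generated classes. The class name, the "/api" path, the placeholder host, and the choice of "getJobs" with one string argument as field 1 are all assumptions, and should be checked against api.thrift and the scheduler configuration.

```java
/**
 * Hypothetical illustration only: hand-encodes one Thrift call using the
 * JSON protocol's message layout. A real client would let the generated
 * stubs and the Thrift library produce this.
 */
final class JsonCallSketch {

    /** Scheduler URL; the "/api" path is an assumption about the install. */
    static String endpoint(String host, int port) {
        return "http://" + host + ":" + port + "/api";
    }

    /**
     * Encodes a call as [version, method, messageType, seqId, args].
     * Version 1 and message type 1 (CALL) follow Thrift's JSON protocol;
     * the single string argument is written as field id 1 (assumed).
     */
    static String encodeStringCall(String method, String argument, int seqId) {
        return "[1,\"" + escape(method) + "\",1," + seqId
                + ",{\"1\":{\"str\":\"" + escape(argument) + "\"}}]";
    }

    /** Escapes backslashes and double quotes; control characters are not handled here. */
    private static String escape(String text) {
        return text.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        // The host name is a placeholder, not a real scheduler address.
        System.out.println("POST " + endpoint("scheduler.example.org", 8081));
        System.out.println(encodeStringCall("getJobs", "centos", 0));
    }
}
```

The sketch only builds strings and sends nothing over the network, so it is safe to run anywhere. In practice the Thrift library's HTTP transport and JSON protocol classes should do this work.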


·         About operations implemented


o    Submit a one-off job

§  I was able to submit a job to Aurora, which then scheduled it to run on Mesos.

§  A job in Aurora is uniquely identified by three parameters (collectively called the Job Key): environment name (e.g. devel), role (e.g. centos), and job name (e.g. hello_world).

§  A typical job path looks like "example/centos/devel/hello_world", where "example" is the name of our Mesos cluster.

§  To submit a job, we need to know the resources it needs (CPUs, RAM, disk) and include them in a task config, which also contains the command to run the application along with other details.

§  The job submitted via the Thrift client ran successfully on Mesos.
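
As a rough illustration of the job path format mentioned above (cluster/role/environment/name), here is a minimal sketch of a helper that splits and rebuilds such a path. The class JobKeyPath is hypothetical and is not part of Aurora or of the generated Thrift classes; it only mirrors the four-part layout from the example.

```java
/**
 * Hypothetical helper (not part of Aurora): models the
 * cluster/role/environment/name path from the example above.
 */
final class JobKeyPath {
    final String cluster;
    final String role;
    final String environment;
    final String name;

    JobKeyPath(String cluster, String role, String environment, String name) {
        this.cluster = requirePart(cluster, "cluster");
        this.role = requirePart(role, "role");
        this.environment = requirePart(environment, "environment");
        this.name = requirePart(name, "name");
    }

    /** Rejects null, empty, or slash-containing parts so the path stays unambiguous. */
    private static String requirePart(String value, String label) {
        if (value == null || value.isEmpty() || value.contains("/")) {
            throw new IllegalArgumentException("invalid " + label + ": " + value);
        }
        return value;
    }

    /** Parses a path such as "example/centos/devel/hello_world". */
    static JobKeyPath parse(String path) {
        String[] parts = path.split("/", -1);
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected 4 parts, got " + parts.length);
        }
        return new JobKeyPath(parts[0], parts[1], parts[2], parts[3]);
    }

    /** Rebuilds the slash-separated path. */
    String toPath() {
        return cluster + "/" + role + "/" + environment + "/" + name;
    }

    public static void main(String[] args) {
        JobKeyPath key = parse("example/centos/devel/hello_world");
        System.out.println(key.role + " " + key.environment + " " + key.name);
    }
}
```

Only the role, environment, and name form the Job Key; the cluster prefix identifies which Mesos cluster the path refers to.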


o    Monitor status of job submitted

§  I submitted two jobs: one with resource requirements the cluster could satisfy, and another with larger requirements than Mesos had available.

§  The first job ran fine, whereas the second could not be scheduled because there were insufficient resources.

§  I was able to get the status of the active job as well as the status of the PENDING job, along with the reason it is PENDING. The response received for the PENDING job is:
PendingReason(taskId:centos-devel-hello_pending-0-1cabf9d3-d315-4bd9-bf1c-8121f4801084, reason:Insufficient: CPU)
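
A minimal sketch of how the pending reason shown above could drive a "provision more resources" decision. To stay dependency-free it parses the printed form of the response; a real client would read the fields directly from the Thrift-generated object. The class PendingReasonText and the rule that a reason starting with "Insufficient" means a resource shortage are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for the printed PendingReason text shown above. */
final class PendingReasonText {
    // Matches: PendingReason(taskId:<id>, reason:<text>)
    private static final Pattern FORMAT =
            Pattern.compile("^PendingReason\\(taskId:(.+?), reason:(.*)\\)$");

    final String taskId;
    final String reason;

    private PendingReasonText(String taskId, String reason) {
        this.taskId = taskId;
        this.reason = reason;
    }

    /** Extracts the task id and reason, or throws if the text has another shape. */
    static PendingReasonText parse(String printed) {
        Matcher m = FORMAT.matcher(printed.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("unrecognized text: " + printed);
        }
        return new PendingReasonText(m.group(1), m.group(2));
    }

    /** Assumption: reasons that begin with "Insufficient" indicate a resource shortage. */
    boolean isResourceShortage() {
        return reason.startsWith("Insufficient");
    }

    public static void main(String[] args) {
        PendingReasonText pending = parse(
                "PendingReason(taskId:centos-devel-hello_pending-0-1cabf9d3-d315-4bd9-bf1c-8121f4801084, "
                        + "reason:Insufficient: CPU)");
        if (pending.isResourceShortage()) {
            System.out.println("Consider provisioning more capacity: " + pending.reason);
        }
    }
}
```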


o    Retrieve a list of running jobs

§  The response contains detailed information about the job.

§  Sample parsed response:
# instanceCount: 1
   >> Job Key <<
         # name: hello_world
         # role: centos
         # environment: devel
   >> Identity <<
         # owner: centos
   >> Task Config <<
         # numCPUs: 0.1
         # diskMb: 8
         # ramMb: 1

# priority: 0
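
The three resource fields in the sample above (numCPUs, ramMb, diskMb) are the values a job needs the cluster to satisfy. As a rough sketch, assuming a hypothetical TaskResources holder that is not one of the Thrift-generated classes, a client-side check of a request against known free capacity might look like this. The shortage labels it returns are made up for the example and are not Aurora's own strings.

```java
/** Hypothetical holder for the three resource fields shown in the sample response. */
final class TaskResources {
    final double numCpus;
    final long ramMb;
    final long diskMb;

    TaskResources(double numCpus, long ramMb, long diskMb) {
        if (numCpus < 0 || ramMb < 0 || diskMb < 0) {
            throw new IllegalArgumentException("resources must be non-negative");
        }
        this.numCpus = numCpus;
        this.ramMb = ramMb;
        this.diskMb = diskMb;
    }

    /**
     * Returns a label for the first resource that exceeds what is available,
     * or null if this request fits entirely.
     */
    String firstShortage(TaskResources available) {
        if (numCpus > available.numCpus) return "CPU";
        if (ramMb > available.ramMb) return "RAM";
        if (diskMb > available.diskMb) return "disk";
        return null;
    }

    public static void main(String[] args) {
        // Request values taken from the sample response; capacity values are invented.
        TaskResources request = new TaskResources(0.1, 1, 8);
        TaskResources free = new TaskResources(2.0, 4096, 10000);
        String shortage = request.firstShortage(free);
        System.out.println(shortage == null ? "fits" : "short on " + shortage);
    }
}
```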

Next Steps:


·         Complete the implementation of all functions relevant to Airavata job submission/monitoring.

·         Dynamically add slaves based on the health of jobs/cluster.

·         Find out how to enable socket-based communication using the binary protocol with the Aurora Scheduler on our cluster.

Thanks and Regards,
Gourav Shenoy
