Posted to dev@airavata.apache.org by Mangirish Wagle <va...@gmail.com> on 2016/04/06 06:08:05 UTC

Re: [GSOC Proposal] Cloud based clusters for Apache Airavata

Hello,

I have put together a Cloud Interface project as an initial POC, with
utility functions to create and delete servers. It defines a common cloud
interface, which has been implemented for OpenStack clouds using
OpenStack4j.
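
As a rough illustration of what a provider-neutral interface of this kind
could look like, here is a minimal sketch. It is not the code in the pull
request: the names CloudInterface, InMemoryCloud, createServer and
deleteServer are hypothetical, and the in-memory class only stands in for
a real OpenStack-backed implementation so that the example runs without
cloud credentials.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical provider-neutral contract; names are illustrative only. */
interface CloudInterface {
    /** Creates a server and returns its provider-assigned id. */
    String createServer(String name, String imageId, String flavorId, String keyPairName);

    /** Deletes the server; returns false if the id is unknown. */
    boolean deleteServer(String serverId);
}

/** In-memory stand-in for a real backend (assumption: no cloud access needed). */
class InMemoryCloud implements CloudInterface {
    private final Map<String, String> serverNamesById = new LinkedHashMap<>();

    @Override
    public String createServer(String name, String imageId, String flavorId, String keyPairName) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("server name is required");
        }
        // A real implementation would call the provider API here.
        String id = UUID.randomUUID().toString();
        serverNamesById.put(id, name);
        return id;
    }

    @Override
    public boolean deleteServer(String serverId) {
        return serverNamesById.remove(serverId) != null;
    }

    int serverCount() {
        return serverNamesById.size();
    }
}

class CloudInterfaceDemo {
    public static void main(String[] args) {
        InMemoryCloud cloud = new InMemoryCloud();
        String id = cloud.createServer("airavata-node-1", "image-id", "flavor-id", "demo-keypair");
        System.out.println("servers after create = " + cloud.serverCount());
        System.out.println("deleted = " + cloud.deleteServer(id));
    }
}
```

Keeping callers on the interface rather than on a concrete class is what
would let another cloud be added later without touching the calling code.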

A Maven build has been set up for the project, and a sample unit test
demonstrates creating a server with an associated keypair and then deleting
it on Jetstream OpenStack using the scigap credentials. A README file added
to the project contains the steps to test-run it.

The current code does not handle the network setup required to make the
created virtual machines accessible over the public network. I shall work
on this as soon as I find some time out of my academic activities and
schedule.

I have created the following pull request for the current code from my
forked repo to the Airavata repo:

https://github.com/apache/airavata/pull/30

Please review and let me know your comments.

Thanks.

Best Regards,
Mangirish


On Thu, Mar 24, 2016 at 9:42 PM, Suresh Marru <sm...@apache.org> wrote:

> Hi Mangirish,
>
> Yes, I have now noticed the scaling within the Heat section. It makes sense
> to leave it behind the orchestration layer rather than re-invent that logic.
>
> The Airavata Orchestrator will be the natural place to call the provisioning
> service and bootstrap the Mesos cluster. The Ansible scripts I referred to
> have not yet been contributed to the repo. I am cc’ing Pankaj and Renan, who
> can probably make that contribution. You can read about their effort in
> http://onlinelibrary.wiley.com/doi/10.1002/cpe.3708/full
>
> Renan,
>
> Mangirish is proposing a project to programmatically interact with cloud
> interfaces (like OpenStack on Jetstream) and provision resources. I would
> assume the component you have developed will then take over and bootstrap
> the Mesos cluster, which GFac can then submit jobs to (through Aurora).
>
> Suresh
>
>
> On Mar 24, 2016, at 9:14 PM, Mangirish Wagle <va...@gmail.com>
> wrote:
>
> Hello,
>
> I was trying to understand the end-to-end flow of Airavata with the Cloud
> Orchestrator and had the following question:
>
> Once the cluster has been set up, as we discussed, Ansible or some other
> configuration management tool would bootstrap and configure Mesos. Which
> component in Airavata would host and call the Ansible script, and what event
> would trigger it?
>
> Thanks.
>
> Regards,
> Mangirish
>
> On Thu, Mar 24, 2016 at 9:07 PM, Mangirish Wagle <vaglomangirish@gmail.com
> > wrote:
>
>> Thanks for your feedback Suresh!
>>
>> I have covered autoscaling in the Heat Orchestration solution, which
>> handles dynamic scaling of resources in an existing cloud. Please let me
>> know if you think that needs to be restructured.
>>
>> Also, I have updated the Google doc and Wiki with the revised proposal,
>> after making changes as per Marlon's review comments.
>>
>> Please review it again and check whether anything still needs to be
>> revised.
>>
>> Thank you!
>>
>> Regards,
>> Mangirish
>>
>> On Thu, Mar 24, 2016 at 7:18 PM, Suresh Marru <sm...@apache.org> wrote:
>>
>>> Hi Mangirish,
>>>
>>> Your proposal has all the required detail. One optional addition: you
>>> could clarify whether resources can be expanded or contracted in a
>>> previously provisioned cloud.
>>>
>>> Suresh
>>>
>>> On Mar 23, 2016, at 9:10 PM, Mangirish Wagle <va...@gmail.com>
>>> wrote:
>>>
>>> Thanks Shameera for the info and sharing the JIRA Epic details.
>>>
>>> I have drafted my GSOC Proposal for the project and I request you to
>>> please review the same:-
>>>
>>>
>>> https://cwiki.apache.org/confluence/display/AIRAVATA/GSOC+Proposal-+Cloud+Based+Clusters+for+Apache+Airavata
>>>
>>> I shall submit this on the GSOC portal by tomorrow, once I get my
>>> enrollment verification proof.
>>>
>>> Regards,
>>> Mangirish
>>>
>>>
>>>
>>> On Wed, Mar 23, 2016 at 12:29 PM, Shameera Rathnayaka <
>>> shameerainfo@gmail.com> wrote:
>>>
>>>> Hi Mangirish,
>>>>
>>>> Yes, your understanding above is right. GFac is like a task executor
>>>> that executes whatever task the Orchestrator gives it.
>>>>
>>>> Here is the epic: https://issues.apache.org/jira/browse/AIRAVATA-1924
>>>> OpenStack integration is part of this epic. You can create a new
>>>> top-level JIRA ticket and create subtasks under that ticket.
>>>>
>>>> Regards,
>>>> Shameera.
>>>>
>>>> On Wed, Mar 23, 2016 at 12:20 PM Mangirish Wagle <
>>>> vaglomangirish@gmail.com> wrote:
>>>>
>>>>> Thanks Marlon for the info. So my understanding is that the Orchestrator
>>>>> would decide whether a job needs to be submitted to a cloud-based cluster
>>>>> and route it to GFac, which would have a separate interface to the cloud
>>>>> cluster service.
>>>>>
>>>>> Also, is there any Story/Epic in JIRA for this project that I can use to
>>>>> create and track tasks? If not, can I create one?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Regards,
>>>>> Mangirish
>>>>>
>>>>> On Wed, Mar 23, 2016 at 12:01 PM, Pierce, Marlon <ma...@iu.edu>
>>>>> wrote:
>>>>>
>>>>>> The Application Factory component is called “gfac” in the code base.
>>>>>> This is the part that handles the interfacing to the remote resource (most
>>>>>> often by ssh but other providers exist). The Orchestrator routes jobs to
>>>>>> GFAC instances.
>>>>>>
>>>>>> From: Mangirish Wagle <va...@gmail.com>
>>>>>> Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
>>>>>> Date: Wednesday, March 23, 2016 at 11:56 AM
>>>>>> To: "dev@airavata.apache.org" <de...@airavata.apache.org>
>>>>>> Subject: Re: [GSOC Proposal] Cloud based clusters for Apache Airavata
>>>>>>
>>>>>> Hello Team,
>>>>>>
>>>>>> I was drafting the GSOC proposal and I just had a quick question
>>>>>> about the integration of the project with Apache Airavata.
>>>>>>
>>>>>> Which is the component in Airavata that would call the service to
>>>>>> provision the cloud cluster?
>>>>>>
>>>>>> I am looking at the Airavata architecture diagram, and my
>>>>>> understanding is that this would be treated as a new Application with a
>>>>>> separate application interface in the 'Application Factory' component.
>>>>>> Also, the workflow orchestrator would have the intelligence to figure
>>>>>> out which jobs are to be submitted to cloud-based clusters.
>>>>>>
>>>>>> Please let me know whether my understanding is correct.
>>>>>>
>>>>>> Thank you.
>>>>>>
>>>>>> Best Regards,
>>>>>> Mangirish Wagle
>>>>>>
>>>>>> On Tue, Mar 22, 2016 at 2:28 PM, Pierce, Marlon <ma...@iu.edu>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Mangirish, please add your proposal to the GSOC 2016 site.
>>>>>>>
>>>>>>> From: Mangirish Wagle <va...@gmail.com>
>>>>>>> Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
>>>>>>> Date: Thursday, March 17, 2016 at 3:35 PM
>>>>>>> To: "dev@airavata.apache.org" <de...@airavata.apache.org>
>>>>>>> Subject: [GSOC Proposal] Cloud based clusters for Apache Airavata
>>>>>>>
>>>>>>> Hello Dev Team,
>>>>>>>
>>>>>>> I had the opportunity to interact with Suresh and Shameera wherein
>>>>>>> we discussed an open requirement in Airavata to be addressed. The
>>>>>>> requirement is to expand the capabilities of Apache Airavata to submit jobs
>>>>>>> to cloud based clusters in addition to HPC/ HTC clusters.
>>>>>>>
>>>>>>> The idea is to dynamically provision a cloud cluster in an
>>>>>>> environment like Jetstream, based on the configuration figured out by
>>>>>>> Airavata. The cluster would be operated by distributed systems
>>>>>>> management software like Mesos. The initial high-level goals would be:
>>>>>>>
>>>>>>>    1. Airavata categorizes certain jobs to be run on cloud-based
>>>>>>>    clusters and figures out the required hardware config for the cluster.
>>>>>>>    2. The proposed service would provision the cluster with the
>>>>>>>    required resources.
>>>>>>>    3. An ansible script would configure a Mesos cluster with the
>>>>>>>    resources provisioned.
>>>>>>>    4. Airavata submits the job to the Mesos cluster.
>>>>>>>    5. Mesos then figures out the efficient resource allocation
>>>>>>>    within the cluster and runs the job and fetches the result.
>>>>>>>    6. The cluster is then deprovisioned automatically when not in
>>>>>>>    use.
>>>>>>>
>>>>>>> The project would mainly focus on points 2 and 6 above.
>>>>>>>
>>>>>>> To start with, I am trying to get a working prototype that sets up
>>>>>>> compute nodes in an OpenStack environment using JClouds (targeted at
>>>>>>> Jetstream). I am also planning to explore using the OpenStack Heat
>>>>>>> engine to orchestrate the cluster. However, going forward Airavata
>>>>>>> would support other clouds like Amazon EC2 or the Comet cluster, so we
>>>>>>> need a generic solution for achieving the goal.
>>>>>>>
>>>>>>> Another approach, which might be more efficient in terms of performance
>>>>>>> and time, is to use container-based clouds built on Docker and
>>>>>>> Kubernetes, which would have substantially less bootstrap time than
>>>>>>> cloud VMs. This would be a future prospect, as not all clusters may
>>>>>>> support containerization.
>>>>>>>
>>>>>>> This has been considered as a potential GSOC project and I would be
>>>>>>> working on drafting a proposal on this idea.
>>>>>>>
>>>>>>> Any inputs/ comments/ suggestions would be very helpful.
>>>>>>>
>>>>>>> Best Regards,
>>>>>>> Mangirish Wagle
>>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>> Shameera Rathnayaka
>>>>
>>>
>>>
>>>
>>
>
>

Re: [GSOC Proposal] Cloud based clusters for Apache Airavata

Posted by Suresh Marru <sm...@apache.org>.
Thanks Mangirish for your contribution; this is a very neat implementation.

Renan, Pankaj, 

Can we get on a Google Hangout and brainstorm how to integrate the Aurora/Mesos work you are doing with the OpenStack integration Mangirish has contributed?

Suresh

> On Apr 16, 2016, at 12:57 AM, Mangirish Wagle <va...@gmail.com> wrote:
> 
> Hello Team,
> 
> I have created a new pull request with the changes that I added today:-
> https://github.com/apache/airavata/pull/32
> 
> The main changes in this request are:
> 1) Added a method to the Cloud Interface to associate a floating IP. The floating IP is also deallocated when the server instance is deleted.
> 2) Changed the methods to use the network name, read from the properties, instead of the network ID, for better readability.
> 3) Added log statements in the Interface implementation for OpenStack (Jetstream).
> 
> Thanks.
> 
> Regards,
> Mangirish
> 
> On Tue, Apr 12, 2016 at 12:56 AM, Mangirish Wagle <vaglomangirish@gmail.com> wrote:
> Hello,
> 
> I have created a new pull request for the cloud-provisioning project after making all the changes suggested in today's code review meeting with Suresh and Shameera. The link is:
> https://github.com/apache/airavata/pull/31
> 
> Also, for the team's awareness, we have configured a new network topology in the Jetstream OpenStack cloud. The network is named "airavata" and is connected to the "public" network through a router. This now lets us provision instances and associate floating IPs with them so that they are publicly accessible (over ssh) from the Internet.
> 
> Thanks.
> 
> Best Regards,
> Mangirish
> 


Re: [GSOC Proposal] Cloud based clusters for Apache Airavata

Posted by Mangirish Wagle <va...@gmail.com>.
Hello Team,

I have created a new pull request with the changes that I added today:-
https://github.com/apache/airavata/pull/32

The main changes in this request are:
1) Added a method to the Cloud Interface to associate a floating IP. The
floating IP is also deallocated when the server instance is deleted.
2) Changed the methods to use the network name, read from the properties,
instead of the network ID, for better readability.
3) Added log statements in the Interface implementation for OpenStack
(Jetstream).
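
The first two changes above can be sketched roughly as follows. This is an
illustration under stated assumptions, not the code in the pull request:
the class FloatingIpSketch, its methods, and the property key
"network.name" are all hypothetical, the cloud is simulated in memory, and
the addresses come from the 203.0.113.0/24 documentation range.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical in-memory model of name-based network lookup and floating IP lifecycle. */
class FloatingIpSketch {
    private final Map<String, String> networkIdsByName = new HashMap<>();
    private final Map<String, String> floatingIpByServer = new HashMap<>();
    private int nextHost = 10;

    /** Registers a network so that it can later be looked up by name. */
    void registerNetwork(String name, String id) {
        networkIdsByName.put(name, id);
    }

    /** Reads the network name from properties and resolves it to an id. */
    String resolveNetworkId(Properties props) {
        String name = props.getProperty("network.name"); // hypothetical key
        String id = networkIdsByName.get(name);
        if (id == null) {
            throw new IllegalStateException("Unknown network: " + name);
        }
        return id;
    }

    /** Allocates a floating IP for the server, or returns the one it already has. */
    String associateFloatingIp(String serverId) {
        return floatingIpByServer.computeIfAbsent(serverId, s -> "203.0.113." + nextHost++);
    }

    /** Deleting the server also releases its floating IP. */
    void deleteServer(String serverId) {
        floatingIpByServer.remove(serverId);
    }

    boolean hasFloatingIp(String serverId) {
        return floatingIpByServer.containsKey(serverId);
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader("network.name=airavata\n"));

        FloatingIpSketch cloud = new FloatingIpSketch();
        cloud.registerNetwork("airavata", "net-123");
        System.out.println("network id = " + cloud.resolveNetworkId(props));
        System.out.println("floating ip = " + cloud.associateFloatingIp("server-1"));
        cloud.deleteServer("server-1");
        System.out.println("still associated = " + cloud.hasFloatingIp("server-1"));
    }
}
```

Tying the release of the address to server deletion avoids leaving
allocated but unused floating IPs behind.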

Thanks.

Regards,
Mangirish

On Tue, Apr 12, 2016 at 12:56 AM, Mangirish Wagle <va...@gmail.com>
wrote:

> Hello,
>
> I have created a new pull request for cloud-provisioning project after
> making all the changes suggested during the code review conducted today
> during the meeting with Suresh and Shameera. Following is the link:-
> https://github.com/apache/airavata/pull/31
>
> Also, for the team's awareness, we have managed to configure a new network
> topology in the Jetstream Openstack cloud. The name of the network is
> "airavata" and it is connected to the "public" network using a router. This
> now enables us to provision instances and associate publicly accessible
> floating IPs so that they are accessible (over ssh) from Internet.
>
> Thanks.
>
> Best Regards,
> Mangirish
>
> On Wed, Apr 6, 2016 at 12:08 AM, Mangirish Wagle <vaglomangirish@gmail.com
> > wrote:
>
>> Hello,
>>
>> I have managed to put together a Cloud Interface project as initial POC
>> with utility functions to create, delete servers. I have created a common
>> cloud interface which has been implemented for Openstack Clouds using
>> Openstack4j.
>>
>> A maven build has been setup for the project and a sample unit test has
>> been added to the project to test and demonstrate a server create with
>> associated keypair and delete operation on Jetstream Openstack using scigap
>> credentials. A README file added to the project contain the steps to test
>> run the project.
>>
>> The current code does not handle the network setup that is required to
>> make the virtual machines created, accessible over the public network. I
>> shall work on getting this done as soon as I find some time out of my
>> academic activities and schedule.
>>
>> I have created following pull request for the current code from my forked
>> repo to Airavata repo:-
>>
>> https://github.com/apache/airavata/pull/30
>>
>> You may please review and let me know your comments.
>>
>> Thanks.
>>
>> Best Regards,
>> Mangirish
>>
>>
>> On Thu, Mar 24, 2016 at 9:42 PM, Suresh Marru <sm...@apache.org> wrote:
>>
>>> Hi Mangirish,
>>>
>>> Yes now I noticed the scaling within the heat section. Yes it makes
>>> sense to leave it behind the orchestration layer not to re-invent that
>>> logic.
>>>
>>> Airavata Orchestrator will be the natural plan to call the provisioning
>>> service and bootstrap the mesos cluster.  The ansible I referred to are not
>>> yet contributed into the repo. I am cc’ing Pankaj and Renan who can
>>> probably make that contribution. You can read about their effort in
>>> http://onlinelibrary.wiley.com/doi/10.1002/cpe.3708/full
>>>
>>> Renan,
>>>
>>> Mangirish is proposing a project to programmatically interact with Cloud
>>> Interfaces (like Open Stack on Jetstream) and provision resources. I would
>>> assume then the component you have developed will take over and bootstrap
>>> the mesos cluster which GFac can then submit jobs to (through Aurora).
>>>
>>> Suresh
>>>
>>>
>>> On Mar 24, 2016, at 9:14 PM, Mangirish Wagle <va...@gmail.com>
>>> wrote:
>>>
>>> Hello,
>>>
>>> I was trying to understand the end result flow of the Airavata with
>>> Cloud Orchestrator and had the following question:-
>>>
>>> Once the cluster has been setup, as we discussed, an ansible or some
>>> configuration management tool would boostrap and configure mesos. Which
>>> component in Airavata would host and call the ansible script and what event
>>> would trigger it?
>>>
>>> Thanks.
>>>
>>> Regards,
>>> Mangirish
>>>
>>> On Thu, Mar 24, 2016 at 9:07 PM, Mangirish Wagle <
>>> vaglomangirish@gmail.com> wrote:
>>>
>>>> Thanks for your feedback Suresh!
>>>>
>>>> I have mentioned about the Autoscaling in the Heat Orchestration
>>>> solution, which does the dynamic scaling of resources in an existing cloud.
>>>> Please let me know if you think that needs to be restructured.
>>>>
>>>> Also, I have updated the Google doc and Wiki with the revised proposal,
>>>> after making changes as per Marlon's review comments.
>>>>
>>>> I request you to please review again and check if there is anything
>>>> that needs still needs to be revised.
>>>>
>>>> Thank you!
>>>>
>>>> Regards,
>>>> Mangirish
>>>>
>>>> On Thu, Mar 24, 2016 at 7:18 PM, Suresh Marru <sm...@apache.org>
>>>> wrote:
>>>>
>>>>> Hi Mangirish,
>>>>>
>>>>> Your proposal has all the required good detail. One optional addition
>>>>> you can clarify on if you can expand or contract resources to a previously
>>>>> provisioned cloud.
>>>>>
>>>>> Suresh
>>>>>
>>>>> On Mar 23, 2016, at 9:10 PM, Mangirish Wagle <va...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Thanks Shameera for the info and sharing the JIRA Epic details.
>>>>>
>>>>> I have drafted my GSOC Proposal for the project and I request you to
>>>>> please review the same:-
>>>>>
>>>>>
>>>>> https://cwiki.apache.org/confluence/display/AIRAVATA/GSOC+Proposal-+Cloud+Based+Clusters+for+Apache+Airavata
>>>>>
>>>>> I shall submit this on the GSOC portal by tomorrow, once I get my
>>>>> enrollment verification proof.
>>>>>
>>>>> Regards,
>>>>> Mangirish
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Mar 23, 2016 at 12:29 PM, Shameera Rathnayaka <
>>>>> shameerainfo@gmail.com> wrote:
>>>>>
>>>>>> Hi Mangirish,
>>>>>>
>>>>>> Yes your above understanding is right. Gfac is like task executor
>>>>>> which execute what ever task given by Orchestrator.
>>>>>>
>>>>>> Here is the epic https://issues.apache.org/jira/browse/AIRAVATA-1924,
>>>>>> Open stack integration is part of this epic, you can create a new top level
>>>>>> jira ticket and create subtask under that ticket.
>>>>>>
>>>>>> Regards,
>>>>>> Shameera.
>>>>>>
>>>>>> On Wed, Mar 23, 2016 at 12:20 PM Mangirish Wagle <
>>>>>> vaglomangirish@gmail.com> wrote:
>>>>>>
>>>>>>> Thanks Marlon for the info. So what I get is that the Orchestrator
>>>>>>> would decide if the job needs to be submitted to cloud based cluster and
>>>>>>> route it to GFAC which would have a separate interfacing with the cloud
>>>>>>> cluster service.
>>>>>>>
>>>>>>> Also I wanted to know if there is any Story/ Epic created in JIRA
>>>>>>> for this project which I can use to create and track tasks? If not can I
>>>>>>> create one?
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Mangirish
>>>>>>>
>>>>>>> On Wed, Mar 23, 2016 at 12:01 PM, Pierce, Marlon <ma...@iu.edu>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> The Application Factory component is called “gfac” in the code
>>>>>>>> base.  This is the part that handles the interfacing to the remote resource
>>>>>>>> (most often by ssh but other providers exist). The Orchestrator routes jobs
>>>>>>>> to GFAC instances.
>>>>>>>>
>>>>>>>> From: Mangirish Wagle <va...@gmail.com>
>>>>>>>> Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
>>>>>>>> Date: Wednesday, March 23, 2016 at 11:56 AM
>>>>>>>> To: "dev@airavata.apache.org" <de...@airavata.apache.org>
>>>>>>>> Subject: Re: [GSOC Proposal] Cloud based clusters for Apache
>>>>>>>> Airavata
>>>>>>>>
>>>>>>>> Hello Team,
>>>>>>>>
>>>>>>>> I was drafting the GSOC proposal and I just had a quick question
>>>>>>>> about the integration of the project with Apache Airavata.
>>>>>>>>
>>>>>>>> Which is the component in Airavata that would call the service to
>>>>>>>> provision the cloud cluster?
>>>>>>>>
>>>>>>>> I am looking at the Airavata architecture diagram and my
>>>>>>>> understanding is that this would be treated as a new Application and would
>>>>>>>> have a separate application interface in 'Application Factory' component.
>>>>>>>> Also the workflow orchestrator would be having the intelligence to figure
>>>>>>>> out which jobs to be submitted to cloud based clusters.
>>>>>>>>
>>>>>>>> Please let me know whether my understanding is correct.
>>>>>>>>
>>>>>>>> Thank you.
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Mangirish Wagle
>>>>>>>>
>>>>>>>> On Tue, Mar 22, 2016 at 2:28 PM, Pierce, Marlon <ma...@iu.edu>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Mangirish, please add your proposal to the GSOC 2016 site.
>>>>>>>>>
>>>>>>>>> From: Mangirish Wagle <va...@gmail.com>
>>>>>>>>> Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
>>>>>>>>> Date: Thursday, March 17, 2016 at 3:35 PM
>>>>>>>>> To: "dev@airavata.apache.org" <de...@airavata.apache.org>
>>>>>>>>> Subject: [GSOC Proposal] Cloud based clusters for Apache Airavata
>>>>>>>>>
>>>>>>>>> Hello Dev Team,
>>>>>>>>>
>>>>>>>>> I had the opportunity to interact with Suresh and Shameera wherein
>>>>>>>>> we discussed an open requirement in Airavata to be addressed. The
>>>>>>>>> requirement is to expand the capabilities of Apache Airavata to submit jobs
>>>>>>>>> to cloud based clusters in addition to HPC/ HTC clusters.
>>>>>>>>>
>>>>>>>>> The idea is to dynamically provision a cloud cluster in an
>>>>>>>>> environment like Jetstream, based on the configuration figured out by
>>>>>>>>> Airavata, which would be operated by a distributed system management
>>>>>>>>> software like Mesos. An initial high level goals would be:-
>>>>>>>>>
>>>>>>>>>    1. Airavata categorizes certain jobs to be run on cloud based
>>>>>>>>>    clusters and figure out the required hardware config for the cluster.
>>>>>>>>>    2. The proposed service would provision the cluster with the
>>>>>>>>>    required resources.
>>>>>>>>>    3. An ansible script would configure a Mesos cluster with the
>>>>>>>>>    resources provisioned.
>>>>>>>>>    4. Airavata submits the job to the Mesos cluster.
>>>>>>>>>    5. Mesos then figures out the efficient resource allocation
>>>>>>>>>    within the cluster and runs the job and fetches the result.
>>>>>>>>>    6. The cluster is then deprovisioned automatically when not in
>>>>>>>>>    use.
>>>>>>>>>
>>>>>>>>> The project would mainly focus on point 2 and 6 above.
>>>>>>>>>
>>>>>>>>> To start with, I am currently trying to get a working prototype that
>>>>>>>>> sets up compute nodes in an OpenStack environment using JClouds
>>>>>>>>> (targeted at Jetstream). I am also planning to explore using the
>>>>>>>>> OpenStack Heat engine to orchestrate the cluster. However, going
>>>>>>>>> forward Airavata would support other clouds such as Amazon EC2 or the
>>>>>>>>> Comet cluster, so we need a generic solution.
>>>>>>>>>
>>>>>>>>> Another approach, which might be more efficient in terms of performance
>>>>>>>>> and time, is to use container-based clouds built on Docker and
>>>>>>>>> Kubernetes, which would have substantially less bootstrap time than
>>>>>>>>> cloud VMs. This would be a future prospect, as not all clusters may
>>>>>>>>> support containerization.
>>>>>>>>>
>>>>>>>>> This has been considered as a potential GSOC project and I would
>>>>>>>>> be working on drafting a proposal on this idea.
>>>>>>>>>
>>>>>>>>> Any inputs/ comments/ suggestions would be very helpful.
>>>>>>>>>
>>>>>>>>> Best Regards,
>>>>>>>>> Mangirish Wagle
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> --
>>>>>> Shameera Rathnayaka
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>

Re: [GSOC Proposal] Cloud based clusters for Apache Airavata

Posted by Mangirish Wagle <va...@gmail.com>.
Hello,

I have created a new pull request for the cloud-provisioning project after
making all the changes suggested in today's code review meeting with Suresh
and Shameera. Here is the link:
https://github.com/apache/airavata/pull/31

Also, for the team's awareness, we have configured a new network topology
in the Jetstream OpenStack cloud. The network is named "airavata" and is
connected to the "public" network through a router. This enables us to
provision instances and associate floating IPs with them so that they are
reachable (over ssh) from the Internet.
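
As a rough illustration of the flow this enables (create a server on the
"airavata" network, associate a floating IP, delete the server), here is a
minimal sketch. It is not the code from the pull request: the interface, the
class names, the keypair name and the in-memory implementation are all
hypothetical, and a real implementation would delegate these calls to an
OpenStack client library.

```java
import java.util.HashMap;
import java.util.Map;

final class CloudProvisioningSketch {

    /** Hypothetical common cloud interface; not the API from the pull request. */
    interface CloudInterface {
        String createServer(String name, String keypairName, String networkName);
        String associateFloatingIp(String serverId);
        void deleteServer(String serverId);
    }

    /** In-memory stand-in so the sketch runs without any cloud credentials. */
    static final class InMemoryCloud implements CloudInterface {
        private final Map<String, String> networkByServer = new HashMap<>();
        private final Map<String, String> floatingIpByServer = new HashMap<>();
        private int serverCounter = 0;
        private int nextHostOctet = 10;

        @Override
        public String createServer(String name, String keypairName, String networkName) {
            // The keypair is ignored here; a real cloud would inject it for ssh access.
            String serverId = name + "-" + (++serverCounter);
            networkByServer.put(serverId, networkName);
            return serverId;
        }

        @Override
        public String associateFloatingIp(String serverId) {
            if (!networkByServer.containsKey(serverId)) {
                throw new IllegalArgumentException("Unknown server: " + serverId);
            }
            // 203.0.113.0/24 is reserved for documentation (RFC 5737).
            String ip = "203.0.113." + (nextHostOctet++);
            floatingIpByServer.put(serverId, ip);
            return ip;
        }

        @Override
        public void deleteServer(String serverId) {
            // Release the floating IP together with the server.
            floatingIpByServer.remove(serverId);
            networkByServer.remove(serverId);
        }

        boolean isRunning(String serverId) {
            return networkByServer.containsKey(serverId);
        }
    }

    public static void main(String[] args) {
        InMemoryCloud cloud = new InMemoryCloud();
        String id = cloud.createServer("mesos-node", "example-keypair", "airavata");
        String ip = cloud.associateFloatingIp(id);
        System.out.println(id + " reachable over ssh at " + ip);
        cloud.deleteServer(id);
        System.out.println("running after delete: " + cloud.isRunning(id));
    }
}
```

Keeping the interface free of OpenStack-specific types is one way to leave
room for other clouds later; that is a design assumption of this sketch, not
a statement about how the project is structured.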

Thanks.

Best Regards,
Mangirish

On Wed, Apr 6, 2016 at 12:08 AM, Mangirish Wagle <va...@gmail.com>
wrote:

> Hello,
>
> I have managed to put together a Cloud Interface project as initial POC
> with utility functions to create, delete servers. I have created a common
> cloud interface which has been implemented for Openstack Clouds using
> Openstack4j.
>
> A Maven build has been set up for the project, and a sample unit test has
> been added to test and demonstrate creating a server with an associated
> keypair and deleting it on Jetstream OpenStack using scigap credentials. A
> README file added to the project contains the steps to test-run the
> project.
>
> The current code does not handle the network setup required to make the
> created virtual machines accessible over the public network. I shall work
> on this as soon as I find some time outside my academic activities and
> schedule.
>
> I have created following pull request for the current code from my forked
> repo to Airavata repo:-
>
> https://github.com/apache/airavata/pull/30
>
> You may please review and let me know your comments.
>
> Thanks.
>
> Best Regards,
> Mangirish
>
>
> On Thu, Mar 24, 2016 at 9:42 PM, Suresh Marru <sm...@apache.org> wrote:
>
>> Hi Mangirish,
>>
>> Yes now I noticed the scaling within the heat section. Yes it makes sense
>> to leave it behind the orchestration layer not to re-invent that logic.
>>
>> Airavata Orchestrator will be the natural plan to call the provisioning
>> service and bootstrap the mesos cluster.  The Ansible scripts I referred
>> to are not yet contributed to the repo. I am cc’ing Pankaj and Renan who can
>> probably make that contribution. You can read about their effort in
>> http://onlinelibrary.wiley.com/doi/10.1002/cpe.3708/full
>>
>> Renan,
>>
>> Mangirish is proposing a project to programmatically interact with Cloud
>> Interfaces (like Open Stack on Jetstream) and provision resources. I would
>> assume then the component you have developed will take over and bootstrap
>> the mesos cluster which GFac can then submit jobs to (through Aurora).
>>
>> Suresh
>>
>>
>> On Mar 24, 2016, at 9:14 PM, Mangirish Wagle <va...@gmail.com>
>> wrote:
>>
>> Hello,
>>
>> I was trying to understand the end result flow of the Airavata with Cloud
>> Orchestrator and had the following question:-
>>
>> Once the cluster has been set up, as we discussed, Ansible or some other
>> configuration management tool would bootstrap and configure Mesos. Which
>> component in Airavata would host and call the Ansible script, and what
>> event would trigger it?
>>
>> Thanks.
>>
>> Regards,
>> Mangirish
>>
>> On Thu, Mar 24, 2016 at 9:07 PM, Mangirish Wagle <
>> vaglomangirish@gmail.com> wrote:
>>
>>> Thanks for your feedback Suresh!
>>>
>>> I have covered autoscaling in the Heat orchestration solution, which
>>> dynamically scales resources in an existing cloud. Please let me know if
>>> you think that needs to be restructured.
>>>
>>> Also, I have updated the Google doc and Wiki with the revised proposal,
>>> after making changes as per Marlon's review comments.
>>>
>>> I request you to please review it again and check whether anything still
>>> needs to be revised.
>>>
>>> Thank you!
>>>
>>> Regards,
>>> Mangirish
>>>
>>> On Thu, Mar 24, 2016 at 7:18 PM, Suresh Marru <sm...@apache.org> wrote:
>>>
>>>> Hi Mangirish,
>>>>
>>>> Your proposal has all the required detail. One optional addition: you
>>>> could clarify whether resources can be expanded or contracted on a
>>>> previously provisioned cloud.
>>>>
>>>> Suresh
>>>>
>>>> On Mar 23, 2016, at 9:10 PM, Mangirish Wagle <va...@gmail.com>
>>>> wrote:
>>>>
>>>> Thanks Shameera for the info and sharing the JIRA Epic details.
>>>>
>>>> I have drafted my GSOC Proposal for the project and I request you to
>>>> please review the same:-
>>>>
>>>>
>>>> https://cwiki.apache.org/confluence/display/AIRAVATA/GSOC+Proposal-+Cloud+Based+Clusters+for+Apache+Airavata
>>>>
>>>> I shall submit this on the GSOC portal by tomorrow, once I get my
>>>> enrollment verification proof.
>>>>
>>>> Regards,
>>>> Mangirish
>>>>
>>>>
>>>>
>>>> On Wed, Mar 23, 2016 at 12:29 PM, Shameera Rathnayaka <
>>>> shameerainfo@gmail.com> wrote:
>>>>
>>>>> Hi Mangirish,
>>>>>
>>>>> Yes, your understanding above is right. GFac is like a task executor
>>>>> that executes whatever task the Orchestrator gives it.
>>>>>
>>>>> Here is the epic: https://issues.apache.org/jira/browse/AIRAVATA-1924
>>>>> OpenStack integration is part of this epic; you can create a new top-level
>>>>> JIRA ticket and create subtasks under that ticket.
>>>>>
>>>>> Regards,
>>>>> Shameera.
>>>>>
>>>>> On Wed, Mar 23, 2016 at 12:20 PM Mangirish Wagle <
>>>>> vaglomangirish@gmail.com> wrote:
>>>>>
>>>>>> Thanks Marlon for the info. So what I get is that the Orchestrator
>>>>>> would decide if the job needs to be submitted to cloud based cluster and
>>>>>> route it to GFAC which would have a separate interfacing with the cloud
>>>>>> cluster service.
>>>>>>
>>>>>> Also I wanted to know if there is any Story/ Epic created in JIRA for
>>>>>> this project which I can use to create and track tasks? If not can I create
>>>>>> one?
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> Regards,
>>>>>> Mangirish
>>>>>>
>>>>>> On Wed, Mar 23, 2016 at 12:01 PM, Pierce, Marlon <ma...@iu.edu>
>>>>>> wrote:
>>>>>>
>>>>>>> The Application Factory component is called “gfac” in the code
>>>>>>> base.  This is the part that handles the interfacing to the remote resource
>>>>>>> (most often by ssh but other providers exist). The Orchestrator routes jobs
>>>>>>> to GFAC instances.
>>>>>>>
>>>>>>> From: Mangirish Wagle <va...@gmail.com>
>>>>>>> Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
>>>>>>> Date: Wednesday, March 23, 2016 at 11:56 AM
>>>>>>> To: "dev@airavata.apache.org" <de...@airavata.apache.org>
>>>>>>> Subject: Re: [GSOC Proposal] Cloud based clusters for Apache
>>>>>>> Airavata
>>>>>>>
>>>>>>> Hello Team,
>>>>>>>
>>>>>>> I was drafting the GSOC proposal and I just had a quick question
>>>>>>> about the integration of the project with Apache Airavata.
>>>>>>>
>>>>>>> Which is the component in Airavata that would call the service to
>>>>>>> provision the cloud cluster?
>>>>>>>
>>>>>>> I am looking at the Airavata architecture diagram and my
>>>>>>> understanding is that this would be treated as a new Application and would
>>>>>>> have a separate application interface in 'Application Factory' component.
>>>>>>> Also, the workflow orchestrator would have the intelligence to figure
>>>>>>> out which jobs are to be submitted to cloud based clusters.
>>>>>>>
>>>>>>> Please let me know whether my understanding is correct.
>>>>>>>
>>>>>>> Thank you.
>>>>>>>
>>>>>>> Best Regards,
>>>>>>> Mangirish Wagle
>>>>>>>
>>>>>>> On Tue, Mar 22, 2016 at 2:28 PM, Pierce, Marlon <ma...@iu.edu>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Mangirish, please add your proposal to the GSOC 2016 site.
>>>>>>>>
>>>>>>>> From: Mangirish Wagle <va...@gmail.com>
>>>>>>>> Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
>>>>>>>> Date: Thursday, March 17, 2016 at 3:35 PM
>>>>>>>> To: "dev@airavata.apache.org" <de...@airavata.apache.org>
>>>>>>>> Subject: [GSOC Proposal] Cloud based clusters for Apache Airavata
>>>>>>>>
>>>>>>>> Hello Dev Team,
>>>>>>>>
>>>>>>>> I had the opportunity to interact with Suresh and Shameera wherein
>>>>>>>> we discussed an open requirement in Airavata to be addressed. The
>>>>>>>> requirement is to expand the capabilities of Apache Airavata to submit jobs
>>>>>>>> to cloud based clusters in addition to HPC/ HTC clusters.
>>>>>>>>
>>>>>>>> The idea is to dynamically provision a cloud cluster in an
>>>>>>>> environment like Jetstream, based on the configuration figured out by
>>>>>>>> Airavata, which would be operated by a distributed system management
>>>>>>>> software like Mesos. The initial high-level goals would be:
>>>>>>>>
>>>>>>>>    1. Airavata categorizes certain jobs to be run on cloud based
>>>>>>>>    clusters and figures out the required hardware config for the cluster.
>>>>>>>>    2. The proposed service would provision the cluster with the
>>>>>>>>    required resources.
>>>>>>>>    3. An ansible script would configure a Mesos cluster with the
>>>>>>>>    resources provisioned.
>>>>>>>>    4. Airavata submits the job to the Mesos cluster.
>>>>>>>>    5. Mesos then figures out the efficient resource allocation
>>>>>>>>    within the cluster and runs the job and fetches the result.
>>>>>>>>    6. The cluster is then deprovisioned automatically when not in
>>>>>>>>    use.
>>>>>>>>
>>>>>>>> The project would mainly focus on points 2 and 6 above.
>>>>>>>>
>>>>>>>> To start with, I am currently trying to get a working prototype that
>>>>>>>> sets up compute nodes in an OpenStack environment using JClouds
>>>>>>>> (targeted at Jetstream). I am also planning to explore using the
>>>>>>>> OpenStack Heat engine to orchestrate the cluster. However, going
>>>>>>>> forward Airavata would support other clouds such as Amazon EC2 or the
>>>>>>>> Comet cluster, so we need a generic solution.
>>>>>>>>
>>>>>>>> Another approach, which might be more efficient in terms of performance
>>>>>>>> and time, is to use container-based clouds built on Docker and
>>>>>>>> Kubernetes, which would have substantially less bootstrap time than
>>>>>>>> cloud VMs. This would be a future prospect, as not all clusters may
>>>>>>>> support containerization.
>>>>>>>>
>>>>>>>> This has been considered as a potential GSOC project and I would be
>>>>>>>> working on drafting a proposal on this idea.
>>>>>>>>
>>>>>>>> Any inputs/ comments/ suggestions would be very helpful.
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Mangirish Wagle
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> --
>>>>> Shameera Rathnayaka
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>