Posted to dev@cloudstack.apache.org by Dharmesh Kakadia <dh...@gmail.com> on 2013/04/30 11:34:42 UTC

[GSoC][Proposal] Integration project to deploy and use Mesos on a CloudStack based cloud

Hi,

I am Dharmesh Kakadia, and I am interested in the project "Integration
project to deploy and use Mesos on a CloudStack based cloud" (
https://issues.apache.org/jira/browse/CLOUDSTACK-1784).

I am working on a proposal and would like feedback. Please send
suggestions :)


Abstract:

The project aims to bring a CloudFormation-like [1] service to CloudStack.
One of the prime use cases is running cluster computing frameworks on
CloudStack. A CloudFormation service will give CloudStack users and
administrators the ability to manage and control a set of resources easily.
It will allow booting and configuring a set of VMs to form a cluster. A
simple example would be a LAMP stack. More complex clusters, such as a Mesos
or Hadoop cluster, require somewhat more advanced configuration. Chiradeep
Vittal has already done some work on this front [5] using ruote and
Sinatra. In this project, I will implement a CloudFormation service and
demonstrate how to run a Mesos cluster using it.
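
To make the idea concrete, here is a rough sketch of what a
CloudFormation-style template for a single-instance LAMP stack could look
like, built as a Python dictionary and serialized to JSON. The top-level
keys (AWSTemplateFormatVersion, Parameters, Resources) and the
AWS::EC2::Instance resource type follow the AWS template format. The image
id, the instance type, and the helper name lamp_template are placeholders
made up for illustration, and how such a template would be mapped onto
CloudStack templates and service offerings is not shown.

```python
import json


def lamp_template(image_id, instance_type):
    """Return a minimal CloudFormation-style template for one LAMP server.

    image_id and instance_type are hypothetical placeholders; a real
    deployment would use ids known to the target cloud.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Single-instance LAMP stack (illustrative sketch)",
        "Parameters": {
            "KeyName": {
                "Type": "String",
                "Description": "Name of an existing SSH key pair",
            }
        },
        "Resources": {
            "WebServer": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": image_id,
                    "InstanceType": instance_type,
                    # Ref points at the KeyName parameter declared above.
                    "KeyName": {"Ref": "KeyName"},
                },
            }
        },
    }


if __name__ == "__main__":
    template = lamp_template("example-image-id", "m1.small")
    print(json.dumps(template, indent=2))
```

A Mesos cluster template would presumably follow the same shape, with one
resource for the master and further resources for the slaves.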

Mesos:

Mesos is a resource management platform for clusters [2]. It aims to
increase the resource utilization of clusters by sharing cluster resources
among multiple processing frameworks (such as MapReduce, MPI, and graph
processing) or multiple instances of the same framework. It provides
efficient resource isolation through the use of containers and uses
ZooKeeper for state maintenance and fault tolerance.

What can run on Mesos?

Spark: A cluster computing framework based on the Resilient Distributed
Datasets (RDDs) abstraction. RDDs are more general than MapReduce and can
support iterative and interactive computation while retaining fault
tolerance, scalability, data locality, etc.

Hadoop: A fault-tolerant and scalable distributed computing framework based
on the MapReduce abstraction.

Bagel: A graph processing framework based on Pregel.

Other frameworks, such as MPI and Hypertable, can also run on Mesos.

How to deploy Mesos

Mesos provides cluster installation scripts [7] for cluster deployment.
There are also scripts available to deploy a cluster on Amazon EC2 [8].

Deliverables:

1. CloudFormation service implementation on CloudStack.

2. Integration of CloudFormation with cloudmonkey, the CLI tool.

3. Proof of concept of running Mesos on top of CloudStack using the service.

4. Related documentation.

Architecture and Tools:

The high-level architecture I propose includes the following components:

1. CloudFormation ReST server:

This acts as the point of contact and exposes CloudFormation functionality
as a ReST service. It can be accessed directly or through cloudmonkey; I
will add the corresponding functionality to cloudmonkey. I plan to start
with Dropwizard [3]. Later, the API server could perhaps be merged with the
management server. I plan to use MySQL for storing cluster details.
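
As a rough illustration of the kind of cluster details the server might
store, the sketch below keeps one row per cluster. It uses Python's
built-in sqlite3 module purely as a stand-in for MySQL so that it runs
without a database server. The table layout, the status values, and the
function names are assumptions for illustration, not a settled design.

```python
import sqlite3
import uuid

# Hypothetical schema: one row per cluster managed by the service.
SCHEMA = """
CREATE TABLE IF NOT EXISTS clusters (
    cluster_id TEXT PRIMARY KEY,
    name       TEXT NOT NULL,
    status     TEXT NOT NULL,
    node_count INTEGER NOT NULL
)
"""


def open_store(path=":memory:"):
    """Open (or create) the cluster store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def create_cluster(conn, name, node_count):
    """Record a new cluster in the CREATING state and return its id."""
    cluster_id = str(uuid.uuid4())
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO clusters (cluster_id, name, status, node_count) "
            "VALUES (?, ?, ?, ?)",
            (cluster_id, name, "CREATING", node_count),
        )
    return cluster_id


def get_status(conn, cluster_id):
    """Return the stored status, or None if the cluster is unknown."""
    row = conn.execute(
        "SELECT status FROM clusters WHERE cluster_id = ?", (cluster_id,)
    ).fetchone()
    return row[0] if row else None


def set_status(conn, cluster_id, status):
    """Update the status of an existing cluster."""
    with conn:
        conn.execute(
            "UPDATE clusters SET status = ? WHERE cluster_id = ?",
            (status, cluster_id),
        )
```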

2. Provisioning:

The provisioning module is responsible for handling the booting of VMs
through CloudStack. It uses the CloudStack APIs to launch VMs. I plan to
use preconfigured templates/images with the required dependencies
installed, which will make cluster creation much faster, even for large
clusters. Error handling is a very important part of this module. For
example, what should happen if a few VMs in a cluster fail to boot?
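
One possible answer to the failed-boot question is to retry the failed
launches a few times and then accept the cluster in a degraded state if
enough nodes came up. The sketch below shows that policy only. The
launch_vm callable, the min_nodes threshold, and the state names are
assumptions for illustration; in practice launch_vm would wrap a CloudStack
API call, which is not shown here.

```python
def provision_cluster(launch_vm, node_count, min_nodes, max_attempts=3):
    """Launch node_count VMs, retrying failures, and report the outcome.

    launch_vm is a caller-supplied callable (hypothetical) that takes a node
    index and returns a VM id, or raises an exception if the VM fails to boot.
    """
    if not 0 < min_nodes <= node_count:
        raise ValueError("min_nodes must be between 1 and node_count")

    started = {}  # node index -> VM id
    errors = {}   # node index -> last error message
    pending = list(range(node_count))

    for _attempt in range(max_attempts):
        still_failing = []
        for index in pending:
            try:
                started[index] = launch_vm(index)
                errors.pop(index, None)  # a retry succeeded, clear old error
            except Exception as exc:
                errors[index] = str(exc)
                still_failing.append(index)
        pending = still_failing
        if not pending:
            break

    if len(started) == node_count:
        state = "COMPLETE"
    elif len(started) >= min_nodes:
        state = "DEGRADED"
    else:
        state = "FAILED"
    return {"state": state, "vm_ids": started, "errors": errors}
```

Whether a degraded cluster should be kept or rolled back is a policy
decision; the function only reports enough information for the caller to
make it.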

3. Configuration:

This module deals with configuring the VMs to form a cluster. This can be
done via manual scripts/code or via configuration management tools like
Chef. I plan to use a workflow automation tool such as Rundeck [4].

In general, I want to use Java-based tools as much as possible, since
CloudStack is mostly written in Java. This will make the project easier to
maintain and develop.

Why ReST?

I believe the decoupling provided by the ReST architecture makes the
service easy to extend in the future. For example, one might want to extend
the CloudFormation service with features like auto-scaling of clusters
based on user-defined criteria (rule-based, monitoring-based, etc.).

Services:

1. POST : create a cluster
   - accepts : cluster configuration JSON
   - produces : clusterId

2. GET : get the current status of a request
   - accepts : clusterId
   - produces : JSON describing the current status of the cluster

3. DELETE : remove a cluster
   - accepts : clusterId
   - produces : result (success/failure)

4. UPDATE : add a node to a cluster
   - accepts : cluster configuration JSON and clusterId
   - produces : result (success/failure)
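
The four operations listed above can be sketched as a small in-memory
service. This is only meant to pin down the inputs and outputs, not the
HTTP layer. The class name, the method names, the "name" and "nodes"
configuration fields, and the "CREATING" status are all assumptions for
illustration.

```python
import uuid


class ClusterService:
    """In-memory stand-in for the four proposed operations."""

    def __init__(self):
        self._clusters = {}  # clusterId -> cluster record

    def create(self, config):
        """POST: accept a cluster configuration, return a new clusterId."""
        cluster_id = str(uuid.uuid4())
        self._clusters[cluster_id] = {
            "name": config["name"],
            "nodes": int(config["nodes"]),
            "status": "CREATING",
        }
        return cluster_id

    def status(self, cluster_id):
        """GET: return a copy of the cluster's current state, or None."""
        cluster = self._clusters.get(cluster_id)
        return dict(cluster) if cluster is not None else None

    def delete(self, cluster_id):
        """DELETE: remove the cluster and report success or failure."""
        removed = self._clusters.pop(cluster_id, None) is not None
        return "success" if removed else "failure"

    def add_nodes(self, cluster_id, config):
        """UPDATE: grow an existing cluster by config["nodes"] nodes."""
        cluster = self._clusters.get(cluster_id)
        if cluster is None:
            return "failure"
        cluster["nodes"] += int(config["nodes"])
        return "success"
```

For example, create({"name": "mesos-demo", "nodes": 3}) returns an id that
the other three calls accept. A real server would put an HTTP front end and
persistent storage behind the same four entry points.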


Timeline:

1-1.5 weeks : project design (architecture, tool selection, API design).

1-1.5 weeks : getting familiar with the CloudStack codebase and
architecture details.

1-1.5 weeks : getting familiar with Mesos internals.

1-1.5 weeks : setting up the dev environment.

2-3 weeks : build the provisioning and configuration modules.

Midterm evaluation: provisioning module, configuration module.

1-2 weeks : develop the ReST server.

2-3 weeks : test and integrate.

About me:

I am an MS by Research student at the International Institute of
Information Technology Hyderabad (IIIT-H), Hyderabad, India. I operate our
small lab cluster, which runs on OpenStack, and I am working on a similar
project, HadoopStack [6], which aims to bring data processing to a
multi-cloud environment (work in progress). My area of research is
scheduling in large-scale distributed systems. I have experience with
related tools such as Hadoop, Mesos, OpenStack, Chef, Ironfan, and jclouds.

Email-contact : dhkakadia@gmail.com

More info: http://researchweb.iiit.ac.in/~dharmesh.kakadia/

Why me?

I love open-source projects. I am fascinated by distributed computing and
interested in building and optimizing large-scale systems and data
processing frameworks.

References

[1] http://aws.amazon.com/cloudformation/

[2] http://incubator.apache.org/mesos/

[3] http://dropwizard.codahale.com/

[4] http://rundeck.org/

[5] https://github.com/chiradeep/stackmate

[6] http://siel-iiith.github.io/HadoopStack/

[7] https://github.com/apache/mesos/blob/trunk/docs/Deploy-Scripts.textile

[8] https://github.com/apache/mesos/blob/trunk/docs/EC2-Scripts.textile

In case you have trouble reading this, the Google Docs version of the above is here:

https://docs.google.com/document/d/1ocoBmyHDtOVnBhCELVt1QcgkubSzCyksls2MCTuDPL0


Re: [GSoC][Proposal] Integration project to deploy and use Mesos on a CloudStack based cloud

Posted by Sebastien Goasguen <ru...@gmail.com>.
On May 7, 2013, at 1:12 PM, Dharmesh Kakadia <dh...@gmail.com> wrote:

> I am able to create a single-instance LAMP stack with Chiradeep's
> implementation. I will spend a little time understanding his code (I am
> new to Ruby), and will take it forward from there.
> 
> Thanks,
> Dharmesh
> 

That's great news, feel free to write a "how to" and blog it.

You can also write it in DocBook XML format and send a patch. I have been including examples of third-party tools in our documentation, in the Developer's Guide under tools.

If you git clone cloudstack and look under /docs/en-US, you will see examples of XML files in DocBook. Then check tools.xml (I think that's the one).

-sebastien

> On Tue, May 7, 2013 at 1:16 PM, Sebastien Goasguen <ru...@gmail.com> wrote:
> 
>> 
>> On May 1, 2013, at 7:27 AM, Dharmesh Kakadia <
>> dharmesh.kakadia@research.iiit.ac.in> wrote:
>> 
>>> Sebastien and Chiradeep, thanks for the comments !! That clarified a lot
>> of
>>> things. I just read Chiradeep's blog (
>>> 
>> http://cloudierthanthou.wordpress.com/2013/04/26/stackmate-execute-cloudformation-templates-on-cloudstack/
>> )
>>> which details the service.
>>> 
>>> I am proposing a server side implementation of cloudformation.
>>> 
>>> I misunderstood the ReST and Query API. Thanks for correcting.
>> Information
>>> here(http://gehrcke.de/2009/06/aws-about-api/) helped me. In case we
>> want
>>> to use existing AWS tools for cloudformation, we also would be designing
>>> Query API, not ReST.
>>> 
>>> Sorry for the confusion regarding cloudmonkey. I was proposing to
>> integrate
>>> cloudformation API into cloudstack source code, directly and add
>>> corresponding support in cloudmonkey. But as you suggested, it might be
>>> easy to start with prototype decoupled from cloudstack (Uses cloudstack
>> API
>>> and does not reside in cloudstack). I assume by existing cloudformation
>>> tools you mean AWS tools(
>>> http://aws.amazon.com/developertools/AWS-CloudFormation). Reusing them
>> will
>>> be a really good idea.
>> 
>> Yes first step should be with cloud formation (stackmate) outside
>> cloudstack. And I strongly suggest that you make sure it is compatible with
>> the AWS tools.
>> 
>>> 
>>> There are lot of options for configuration mgmt tools. I have used knife
>>> previously and good to know that it has cloudstack plugin based on fog (
>>> https://github.com/fifthecho/knife-cloudstack-fog). Reasons rundeck
>> looked
>>> better was support for rollbacking and is full workflow execution engine.
>>> Finally rundeck can use chef/puppet. I have seen provisonr/whirr and they
>>> look promising. Definitely a lot to explore here !!
>> 
>> Yes, I have nothing against rundeck. But it would be one more dependency.
>> 
>>> 
>>> Thanks for suggesting clear proposal.
>>> 
>>> Thanks,
>>> Dharmesh
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Wed, May 1, 2013 at 3:07 PM, Sebastien Goasguen <ru...@gmail.com>
>> wrote:
>>> 
>>>> 
>>>> On Apr 30, 2013, at 4:59 PM, Chiradeep Vittal <
>> Chiradeep.Vittal@citrix.com>
>>>> wrote:
>>>> 
>>>>> 
>>>>> 
>>>>> On 4/30/13 5:01 AM, "Sebastien Goasguen" <ru...@gmail.com> wrote:
>>>>> 
>>>>>> Dharmesh, see in-line
>>>>>> 
>>>>>> On Apr 30, 2013, at 5:34 AM, Dharmesh Kakadia <dh...@gmail.com>
>>>> wrote:
>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> I am Dharmesh Kakdia and interested in project "Integration project
>> to
>>>>>>> deploy and use Mesos on a CloudStack based cloud" (
>>>>>>> https://issues.apache.org/jira/browse/CLOUDSTACK-1784)
>>>>>>> 
>>>>>>> I am working on proposal and want to get feedback. Please provide
>>>>>>> suggestions :)
>>>>>>> 
>>>>>>> *
>>>>>>> 
>>>>>>> Abstract:
>>>>>>> 
>>>>>>> The project aims to bring cloudformation[1] like service to
>> cloudstack.
>>>>>>> One
>>>>>>> of the prime use-case is cluster computing frameworks on cloudstack.
>> A
>>>>>>> cloudformation service will give users and administrators of
>> cloudstack
>>>>>>> ability to manage and control a set of resources easily. The
>>>>>>> cloudformation
>>>>>>> will allow booting and configuring a set of VMs and form a cluster.
>>>>>>> Simple
>>>>>>> example would be LAMP stack. More complex clusters such as mesos or
>>>>>>> hadoop
>>>>>>> cluster requires a little more advanced configuration. There is
>> already
>>>>>>> some work done by Chiradeep Vittal at this front [5] using route and
>>>>>> 
>>>>>> it's using ruote: http://ruote.rubyforge.org
>>>>>> 
>>>>>>> sinatra. In this project, I will implement cloudformation service and
>>>>>>> demonstrate how to run mesos cluster using it.
>>>>>> 
>>>>>> You will create cloud formation templates that describe a mesos
>> cluster
>>>>>> 
>>>>>>> 
>>>>>>> Mesos:
>>>>>>> 
>>>>>>> Mesos is a resource management platform for clusters [2]. It aims to
>>>>>>> increase resource utilization of clusters by sharing cluster
>> resources
>>>>>>> among multiple processing frameworks(like MapReduce, MPI, Graph
>>>>>>> Processing)
>>>>>>> or multiple instances of same framework. It provides efficient
>> resource
>>>>>>> isolation through use of containers. Uses zookeeper for state
>>>>>>> maintenance
>>>>>>> and fault tolerance.
>>>>>>> 
>>>>>>> What can run on mesos ?
>>>>>>> 
>>>>>>> Spark: A cluster computing framework based on the Resilient
>> Distributed
>>>>>>> Datasets (RDDs) abstraction. RDD is more generalized than MapReduce
>> and
>>>>>>> can
>>>>>>> support iterative and interactive computation while retaining fault
>>>>>>> tolerance, scalability, data locality etc.
>>>>>>> 
>>>>>>> Hadoop: Hadoop is fault tolerant and scalable distributed computing
>>>>>>> framework based on MapReduce abstraction.
>>>>>>> 
>>>>>>> Begel: A graph processing framework based on pregel.
>>>>>>> 
>>>>>>> and other frameworks like MPI, Hypertable.
>>>>>>> 
>>>>>>> How to deploy mesos
>>>>>>> 
>>>>>>> Mesos provides cluster installation scripts [7] for cluster
>> deployment.
>>>>>>> There are also scripts available to deploy a cluster on Amazon EC2
>> [8].
>>>>>> 
>>>>>> It would be nice to see if these scripts can be used as is with the
>>>>>> CloudStack EC2 service.
>>>>>> 
>>>>>>> 
>>>>>>> Deliverables:
>>>>>>> 
>>>>>>> 1. Cloudformation service implementation on cloudstack.
>>>>>>> 
>>>>>>> 2. Integration of cloudformation with cloudmonkey, CLI tool.
>>>>>> 
>>>>>> 2. is a little confusing. I believe that what Chiradeep prototype runs
>>>> on
>>>>>> the client side. What is needed is a server side implementation.
>>>>>> That way we could use existing cloudformation cli tools to talk to it.
>>>>>> I don't understand where cloudmonkey comes into play. CloudMonkey is a
>>>>>> cli for the CloudStack API. Unless you plan to integrate the
>>>>>> cloudformation API directly in the cloudstack source code, the
>>>>>> integration you propose is not clear to me.
>>>>>> 
>>>>> 
>>>>> Sebastien is correct. I intend to put in the query API server around
>> the
>>>>> core of stack mate soon (as soon as I'm done helping on the internal
>>>>> loadbalancer). This will be written in Ruby.
>>>>> 
>>>>> 
>>>> 
>>>> Dharmesh I suggest you propose the following:
>>>> 
>>>> 1-Deploy CloudStack and understand instance
>> configuration/contextualization
>>>> 2-Test and deploy Mesos on a set of CloudStack based VM, manually.
>>>> Design/propose an automation framework.
>>>> 3-Test stackmate and engage chiradeep (report bugs, make suggestion,
>> make
>>>> pull request)
>>>> 4-Create cloud formation template to provision a Mesos Cluster
>>>> 5-Compare with Apache Whirr or other cluster provisioning tools.
>>>> 6-Potentially if you see a link with cloudmonkey, see how you could
>> extend
>>>> it to talk to stackmate in a similar manner that it talks to CloudStack.
>>>> 
>>>> 
>>>> You are pretty close and this is a very exciting projects, so go ahead,
>>>> modify a bit your proposal and submit it.
>>>> 
>>>> Deadline for applications is this Friday May 3rd.
>>>> 
>>>> -sebastien
>>>> 
>>>> 
>>>> 
>>>> 
>> 
>> 


Re: [GSoC][Proposal] Integration project to deploy and use Mesos on a CloudStack based cloud

Posted by Dharmesh Kakadia <dh...@gmail.com>.
I am able to create a single-instance LAMP stack with Chiradeep's
implementation. I will spend a little time understanding his code (I am
new to Ruby), and will take it forward from there.

Thanks,
Dharmesh

On Tue, May 7, 2013 at 1:16 PM, Sebastien Goasguen <ru...@gmail.com> wrote:

>
> On May 1, 2013, at 7:27 AM, Dharmesh Kakadia <
> dharmesh.kakadia@research.iiit.ac.in> wrote:
>
> > Sebastien and Chiradeep, thanks for the comments !! That clarified a lot
> of
> > things. I just read Chiradeep's blog (
> >
> http://cloudierthanthou.wordpress.com/2013/04/26/stackmate-execute-cloudformation-templates-on-cloudstack/
> )
> > which details the service.
> >
> > I am proposing a server side implementation of cloudformation.
> >
> > I misunderstood the ReST and Query API. Thanks for correcting.
> Information
> > here(http://gehrcke.de/2009/06/aws-about-api/) helped me. In case we
> want
> > to use existing AWS tools for cloudformation, we also would be designing
> > Query API, not ReST.
> >
> > Sorry for the confusion regarding cloudmonkey. I was proposing to
> integrate
> > cloudformation API into cloudstack source code, directly and add
> > corresponding support in cloudmonkey. But as you suggested, it might be
> > easy to start with prototype decoupled from cloudstack (Uses cloudstack
> API
> > and does not reside in cloudstack). I assume by existing cloudformation
> > tools you mean AWS tools(
> > http://aws.amazon.com/developertools/AWS-CloudFormation). Reusing them
> will
> > be a really good idea.
>
> Yes first step should be with cloud formation (stackmate) outside
> cloudstack. And I strongly suggest that you make sure it is compatible with
> the AWS tools.
>
> >
> > There are lot of options for configuration mgmt tools. I have used knife
> > previously and good to know that it has cloudstack plugin based on fog (
> > https://github.com/fifthecho/knife-cloudstack-fog). Reasons rundeck
> looked
> > better was support for rollbacking and is full workflow execution engine.
> > Finally rundeck can use chef/puppet. I have seen provisonr/whirr and they
> > look promising. Definitely a lot to explore here !!
>
> Yes, I have nothing against rundeck. But it would be one more dependency.
>
> >
> > Thanks for suggesting clear proposal.
> >
> > Thanks,
> > Dharmesh
> >
> >
> >
> >
> >
> > On Wed, May 1, 2013 at 3:07 PM, Sebastien Goasguen <ru...@gmail.com>
> wrote:
> >
> >>
> >> On Apr 30, 2013, at 4:59 PM, Chiradeep Vittal <
> Chiradeep.Vittal@citrix.com>
> >> wrote:
> >>
> >>>
> >>>
> >>> On 4/30/13 5:01 AM, "Sebastien Goasguen" <ru...@gmail.com> wrote:
> >>>
> >>>> Dharmesh, see in-line
> >>>>
> >>>> On Apr 30, 2013, at 5:34 AM, Dharmesh Kakadia <dh...@gmail.com>
> >> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> I am Dharmesh Kakdia and interested in project "Integration project
> to
> >>>>> deploy and use Mesos on a CloudStack based cloud" (
> >>>>> https://issues.apache.org/jira/browse/CLOUDSTACK-1784)
> >>>>>
> >>>>> I am working on proposal and want to get feedback. Please provide
> >>>>> suggestions :)
> >>>>>
> >>>>> *
> >>>>>
> >>>>> Abstract:
> >>>>>
> >>>>> The project aims to bring cloudformation[1] like service to
> cloudstack.
> >>>>> One
> >>>>> of the prime use-case is cluster computing frameworks on cloudstack.
> A
> >>>>> cloudformation service will give users and administrators of
> cloudstack
> >>>>> ability to manage and control a set of resources easily. The
> >>>>> cloudformation
> >>>>> will allow booting and configuring a set of VMs and form a cluster.
> >>>>> Simple
> >>>>> example would be LAMP stack. More complex clusters such as mesos or
> >>>>> hadoop
> >>>>> cluster requires a little more advanced configuration. There is
> already
> >>>>> some work done by Chiradeep Vittal at this front [5] using route and
> >>>>
> >>>> it's using ruote: http://ruote.rubyforge.org
> >>>>
> >>>>> sinatra. In this project, I will implement cloudformation service and
> >>>>> demonstrate how to run mesos cluster using it.
> >>>>
> >>>> You will create cloud formation templates that describe a mesos
> cluster
> >>>>
> >>>>>
> >>>>> Mesos:
> >>>>>
> >>>>> Mesos is a resource management platform for clusters [2]. It aims to
> >>>>> increase resource utilization of clusters by sharing cluster
> resources
> >>>>> among multiple processing frameworks(like MapReduce, MPI, Graph
> >>>>> Processing)
> >>>>> or multiple instances of same framework. It provides efficient
> resource
> >>>>> isolation through use of containers. Uses zookeeper for state
> >>>>> maintenance
> >>>>> and fault tolerance.
> >>>>>
> >>>>> What can run on mesos ?
> >>>>>
> >>>>> Spark: A cluster computing framework based on the Resilient
> Distributed
> >>>>> Datasets (RDDs) abstraction. RDD is more generalized than MapReduce
> and
> >>>>> can
> >>>>> support iterative and interactive computation while retaining fault
> >>>>> tolerance, scalability, data locality etc.
> >>>>>
> >>>>> Hadoop: Hadoop is fault tolerant and scalable distributed computing
> >>>>> framework based on MapReduce abstraction.
> >>>>>
> >>>>> Begel: A graph processing framework based on pregel.
> >>>>>
> >>>>> and other frameworks like MPI, Hypertable.
> >>>>>
> >>>>> How to deploy mesos
> >>>>>
> >>>>> Mesos provides cluster installation scripts [7] for cluster
> deployment.
> >>>>> There are also scripts available to deploy a cluster on Amazon EC2
> [8].
> >>>>
> >>>> It would be nice to see if these scripts can be used as is with the
> >>>> CloudStack EC2 service.
> >>>>
> >>>>>
> >>>>> Deliverables:
> >>>>>
> >>>>> 1. Cloudformation service implementation on cloudstack.
> >>>>>
> >>>>> 2. Integration of cloudformation with cloudmonkey, CLI tool.
> >>>>
> >>>> 2. is a little confusing. I believe that what Chiradeep prototype runs
> >> on
> >>>> the client side. What is needed is a server side implementation.
> >>>> That way we could use existing cloudformation cli tools to talk to it.
> >>>> I don't understand where cloudmonkey comes into play. CloudMonkey is a
> >>>> cli for the CloudStack API. Unless you plan to integrate the
> >>>> cloudformation API directly in the cloudstack source code, the
> >>>> integration you propose is not clear to me.
> >>>>
> >>>
> >>> Sebastien is correct. I intend to put in the query API server around
> the
> >>> core of stack mate soon (as soon as I'm done helping on the internal
> >>> loadbalancer). This will be written in Ruby.
> >>>
> >>>
> >>
> >> Dharmesh I suggest you propose the following:
> >>
> >> 1-Deploy CloudStack and understand instance
> configuration/contextualization
> >> 2-Test and deploy Mesos on a set of CloudStack based VM, manually.
> >> Design/propose an automation framework.
> >> 3-Test stackmate and engage chiradeep (report bugs, make suggestion,
> make
> >> pull request)
> >> 4-Create cloud formation template to provision a Mesos Cluster
> >> 5-Compare with Apache Whirr or other cluster provisioning tools.
> >> 6-Potentially if you see a link with cloudmonkey, see how you could
> extend
> >> it to talk to stackmate in a similar manner that it talks to CloudStack.
> >>
> >>
> >> You are pretty close and this is a very exciting projects, so go ahead,
> >> modify a bit your proposal and submit it.
> >>
> >> Deadline for applications is this Friday May 3rd.
> >>
> >> -sebastien
> >>
> >>
> >>
> >>
>
>

Re: [GSoC][Proposal] Integration project to deploy and use Mesos on a CloudStack based cloud

Posted by Sebastien Goasguen <ru...@gmail.com>.
On May 1, 2013, at 7:27 AM, Dharmesh Kakadia <dh...@research.iiit.ac.in> wrote:

> Sebastien and Chiradeep, thanks for the comments !! That clarified a lot of
> things. I just read Chiradeep's blog (
> http://cloudierthanthou.wordpress.com/2013/04/26/stackmate-execute-cloudformation-templates-on-cloudstack/)
> which details the service.
> 
> I am proposing a server side implementation of cloudformation.
> 
> I misunderstood the ReST and Query API. Thanks for correcting. Information
> here(http://gehrcke.de/2009/06/aws-about-api/) helped me. In case we want
> to use existing AWS tools for cloudformation, we also would be designing
> Query API, not ReST.
> 
> Sorry for the confusion regarding cloudmonkey. I was proposing to integrate
> cloudformation API into cloudstack source code, directly and add
> corresponding support in cloudmonkey. But as you suggested, it might be
> easy to start with prototype decoupled from cloudstack (Uses cloudstack API
> and does not reside in cloudstack). I assume by existing cloudformation
> tools you mean AWS tools(
> http://aws.amazon.com/developertools/AWS-CloudFormation). Reusing them will
> be a really good idea.

Yes, the first step should be with CloudFormation (stackmate) outside CloudStack. And I strongly suggest that you make sure it is compatible with the AWS tools.

> 
> There are lot of options for configuration mgmt tools. I have used knife
> previously and good to know that it has cloudstack plugin based on fog (
> https://github.com/fifthecho/knife-cloudstack-fog). Reasons rundeck looked
> better was support for rollbacking and is full workflow execution engine.
> Finally rundeck can use chef/puppet. I have seen provisonr/whirr and they
> look promising. Definitely a lot to explore here !!

Yes, I have nothing against rundeck. But it would be one more dependency.

> 
> Thanks for suggesting clear proposal.
> 
> Thanks,
> Dharmesh
> 
> 
> 
> 
> 
> On Wed, May 1, 2013 at 3:07 PM, Sebastien Goasguen <ru...@gmail.com> wrote:
> 
>> 
>> On Apr 30, 2013, at 4:59 PM, Chiradeep Vittal <Ch...@citrix.com>
>> wrote:
>> 
>>> 
>>> 
>>> On 4/30/13 5:01 AM, "Sebastien Goasguen" <ru...@gmail.com> wrote:
>>> 
>>>> Dharmesh, see in-line
>>>> 
>>>> On Apr 30, 2013, at 5:34 AM, Dharmesh Kakadia <dh...@gmail.com>
>> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> I am Dharmesh Kakdia and interested in project "Integration project to
>>>>> deploy and use Mesos on a CloudStack based cloud" (
>>>>> https://issues.apache.org/jira/browse/CLOUDSTACK-1784)
>>>>> 
>>>>> I am working on proposal and want to get feedback. Please provide
>>>>> suggestions :)
>>>>> 
>>>>> *
>>>>> 
>>>>> Abstract:
>>>>> 
>>>>> The project aims to bring cloudformation[1] like service to cloudstack.
>>>>> One
>>>>> of the prime use-case is cluster computing frameworks on cloudstack. A
>>>>> cloudformation service will give users and administrators of cloudstack
>>>>> ability to manage and control a set of resources easily. The
>>>>> cloudformation
>>>>> will allow booting and configuring a set of VMs and form a cluster.
>>>>> Simple
>>>>> example would be LAMP stack. More complex clusters such as mesos or
>>>>> hadoop
>>>>> cluster requires a little more advanced configuration. There is already
>>>>> some work done by Chiradeep Vittal at this front [5] using route and
>>>> 
>>>> it's using ruote: http://ruote.rubyforge.org
>>>> 
>>>>> sinatra. In this project, I will implement cloudformation service and
>>>>> demonstrate how to run mesos cluster using it.
>>>> 
>>>> You will create cloud formation templates that describe a mesos cluster
>>>> 
>>>>> 
>>>>> Mesos:
>>>>> 
>>>>> Mesos is a resource management platform for clusters [2]. It aims to
>>>>> increase resource utilization of clusters by sharing cluster resources
>>>>> among multiple processing frameworks(like MapReduce, MPI, Graph
>>>>> Processing)
>>>>> or multiple instances of same framework. It provides efficient resource
>>>>> isolation through use of containers. Uses zookeeper for state
>>>>> maintenance
>>>>> and fault tolerance.
>>>>> 
>>>>> What can run on mesos ?
>>>>> 
>>>>> Spark: A cluster computing framework based on the Resilient Distributed
>>>>> Datasets (RDDs) abstraction. RDD is more generalized than MapReduce and
>>>>> can
>>>>> support iterative and interactive computation while retaining fault
>>>>> tolerance, scalability, data locality etc.
>>>>> 
>>>>> Hadoop: Hadoop is fault tolerant and scalable distributed computing
>>>>> framework based on MapReduce abstraction.
>>>>> 
>>>>> Begel: A graph processing framework based on pregel.
>>>>> 
>>>>> and other frameworks like MPI, Hypertable.
>>>>> 
>>>>> How to deploy mesos
>>>>> 
>>>>> Mesos provides cluster installation scripts [7] for cluster deployment.
>>>>> There are also scripts available to deploy a cluster on Amazon EC2 [8].
>>>> 
>>>> It would be nice to see if these scripts can be used as is with the
>>>> CloudStack EC2 service.
>>>> 
>>>>> 
>>>>> Deliverables:
>>>>> 
>>>>> 1. Cloudformation service implementation on cloudstack.
>>>>> 
>>>>> 2. Integration of cloudformation with cloudmonkey, CLI tool.
>>>> 
>>>> 2. is a little confusing. I believe that what Chiradeep prototype runs
>> on
>>>> the client side. What is needed is a server side implementation.
>>>> That way we could use existing cloudformation cli tools to talk to it.
>>>> I don't understand where cloudmonkey comes into play. CloudMonkey is a
>>>> cli for the CloudStack API. Unless you plan to integrate the
>>>> cloudformation API directly in the cloudstack source code, the
>>>> integration you propose is not clear to me.
>>>> 
>>> 
>>> Sebastien is correct. I intend to put in the query API server around the
>>> core of stack mate soon (as soon as I'm done helping on the internal
>>> loadbalancer). This will be written in Ruby.
>>> 
>>> 
>> 
>> Dharmesh I suggest you propose the following:
>> 
>> 1-Deploy CloudStack and understand instance configuration/contextualization
>> 2-Test and deploy Mesos on a set of CloudStack based VM, manually.
>> Design/propose an automation framework.
>> 3-Test stackmate and engage chiradeep (report bugs, make suggestion, make
>> pull request)
>> 4-Create cloud formation template to provision a Mesos Cluster
>> 5-Compare with Apache Whirr or other cluster provisioning tools.
>> 6-Potentially if you see a link with cloudmonkey, see how you could extend
>> it to talk to stackmate in a similar manner that it talks to CloudStack.
>> 
>> 
>> You are pretty close and this is a very exciting projects, so go ahead,
>> modify a bit your proposal and submit it.
>> 
>> Deadline for applications is this Friday May 3rd.
>> 
>> -sebastien
>> 
>> 
>> 
>> 


Re: [GSoC][Proposal] Integration project to deploy and use Mesos on a CloudStack based cloud

Posted by Sebastien Goasguen <ru...@gmail.com>.
Dharmesh,

I see your proposal in the Google system. We will get back to you as soon as I know the next steps in the review process.

-sebastien

On May 1, 2013, at 7:27 AM, Dharmesh Kakadia <dh...@research.iiit.ac.in> wrote:

> Sebastien and Chiradeep, thanks for the comments !! That clarified a lot of
> things. I just read Chiradeep's blog (
> http://cloudierthanthou.wordpress.com/2013/04/26/stackmate-execute-cloudformation-templates-on-cloudstack/)
> which details the service.
> 
> I am proposing a server side implementation of cloudformation.
> 
> I misunderstood the ReST and Query API. Thanks for correcting. Information
> here(http://gehrcke.de/2009/06/aws-about-api/) helped me. In case we want
> to use existing AWS tools for cloudformation, we also would be designing
> Query API, not ReST.
> 
> Sorry for the confusion regarding cloudmonkey. I was proposing to integrate
> cloudformation API into cloudstack source code, directly and add
> corresponding support in cloudmonkey. But as you suggested, it might be
> easy to start with prototype decoupled from cloudstack (Uses cloudstack API
> and does not reside in cloudstack). I assume by existing cloudformation
> tools you mean AWS tools(
> http://aws.amazon.com/developertools/AWS-CloudFormation). Reusing them will
> be a really good idea.
> 
> There are lot of options for configuration mgmt tools. I have used knife
> previously and good to know that it has cloudstack plugin based on fog (
> https://github.com/fifthecho/knife-cloudstack-fog). Reasons rundeck looked
> better was support for rollbacking and is full workflow execution engine.
> Finally rundeck can use chef/puppet. I have seen provisonr/whirr and they
> look promising. Definitely a lot to explore here !!
> 
> Thanks for suggesting clear proposal.
> 
> Thanks,
> Dharmesh
> 
> 
> 
> 
> 


Re: [GSoC][Proposal] Integration project to deploy and use Mesos on a CloudStack based cloud

Posted by Dharmesh Kakadia <dh...@research.iiit.ac.in>.
Sebastien and Chiradeep, thanks for the comments! That clarified a lot of
things. I just read Chiradeep's blog post (
http://cloudierthanthou.wordpress.com/2013/04/26/stackmate-execute-cloudformation-templates-on-cloudstack/),
which details the service.

I am proposing a server side implementation of cloudformation.

I had misunderstood the difference between the ReST and Query APIs; thanks
for the correction. The information here
(http://gehrcke.de/2009/06/aws-about-api/) helped me. If we want to use the
existing AWS tools for cloudformation, we would also be designing a Query
API, not ReST.
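
To make the distinction concrete, here is a minimal sketch of what a Query-style request looks like: the operation is named by an Action parameter in the query string rather than by an HTTP verb and resource path. The endpoint, stack name and template URL are placeholders I made up for illustration, and request signing is left out entirely.

```python
from urllib.parse import urlencode


def build_query_request(endpoint, action, params):
    """Return a Query-style URL in which an Action parameter names the operation."""
    query = {"Action": action, "Version": "2010-05-15"}
    query.update(params)
    # Sort the parameters so the generated URL is deterministic.
    return endpoint + "/?" + urlencode(sorted(query.items()))


# Hypothetical endpoint of a CloudFormation-compatible server.
url = build_query_request(
    "http://localhost:8080",
    "CreateStack",
    {"StackName": "mesos-demo",
     "TemplateURL": "http://example.com/mesos.template"},
)
print(url)
```

A ReST design would instead express the same call as something like a POST to a stacks resource, which the existing AWS tools would not understand.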

Sorry for the confusion regarding cloudmonkey. I was proposing to integrate
the cloudformation API directly into the cloudstack source code and add
corresponding support in cloudmonkey. But as you suggested, it might be
easier to start with a prototype decoupled from cloudstack (one that uses
the cloudstack API and does not reside in cloudstack). I assume by existing
cloudformation tools you mean the AWS tools (
http://aws.amazon.com/developertools/AWS-CloudFormation). Reusing them would
be a really good idea.

There are a lot of options for configuration management tools. I have used
knife previously, and it is good to know that it has a cloudstack plugin
based on fog (https://github.com/fifthecho/knife-cloudstack-fog). Rundeck
looked better to me because it supports rollback and is a full workflow
execution engine. Also, rundeck can use chef/puppet. I have looked at
Provisionr and Whirr, and they look promising. Definitely a lot to explore
here!

Thanks for suggesting a clear proposal.

Thanks,
Dharmesh






Re: [GSoC][Proposal] Integration project to deploy and use Mesos on a CloudStack based cloud

Posted by Dharmesh Kakadia <dh...@gmail.com>.
I am not sure whether my earlier mail reached the list or was dropped (I got
a failure notice!), so I am resending it. Sorry in case it is a duplicate.

------

Sebastien and Chiradeep, thanks for the comments! That clarified a lot of
things. I just read Chiradeep's blog post (
http://cloudierthanthou.wordpress.com/2013/04/26/stackmate-execute-cloudformation-templates-on-cloudstack/),
which details the service.

I am proposing a server side implementation of cloudformation.

I had misunderstood the difference between the ReST and Query APIs; thanks
for the correction. The information here
(http://gehrcke.de/2009/06/aws-about-api/) helped me. If we want to use the
existing AWS tools for cloudformation, we would also be designing a Query
API, not ReST.

Sorry for the confusion regarding cloudmonkey. I was proposing to integrate
the cloudformation API directly into the cloudstack source code and add
corresponding support in cloudmonkey. But as you suggested, it might be
easier to start with a prototype decoupled from cloudstack (one that uses
the cloudstack API and does not reside in cloudstack). I assume by existing
cloudformation tools you mean the AWS tools (
http://aws.amazon.com/developertools/AWS-CloudFormation). Reusing them would
be a really good idea.

There are a lot of options for configuration management tools. I have used
knife previously, and it is good to know that it has a cloudstack plugin
based on fog (https://github.com/fifthecho/knife-cloudstack-fog). Rundeck
looked better to me because it supports rollback and is a full workflow
execution engine. Also, rundeck can use chef/puppet. I have looked at
Provisionr and Whirr, and they look promising. Definitely a lot to explore
here!

Thanks for suggesting a clear proposal. I have submitted the proposal.

Thanks,
Dharmesh



Re: [GSoC][Proposal] Integration project to deploy and use Mesos on a CloudStack based cloud

Posted by Sebastien Goasguen <ru...@gmail.com>.
On Apr 30, 2013, at 4:59 PM, Chiradeep Vittal <Ch...@citrix.com> wrote:

> 
> 
> On 4/30/13 5:01 AM, "Sebastien Goasguen" <ru...@gmail.com> wrote:
> 
>> Dharmesh, see in-line
>> 
>> On Apr 30, 2013, at 5:34 AM, Dharmesh Kakadia <dh...@gmail.com> wrote:
>> 
>>> Hi,
>>> 
>>> I am Dharmesh Kakdia and interested in project "Integration project to
>>> deploy and use Mesos on a CloudStack based cloud" (
>>> https://issues.apache.org/jira/browse/CLOUDSTACK-1784)
>>> 
>>> I am working on proposal and want to get feedback. Please provide
>>> suggestions :)
>>> 
>>> *
>>> 
>>> Abstract:
>>> 
>>> The project aims to bring cloudformation[1] like service to cloudstack.
>>> One
>>> of the prime use-case is cluster computing frameworks on cloudstack. A
>>> cloudformation service will give users and administrators of cloudstack
>>> ability to manage and control a set of resources easily. The
>>> cloudformation
>>> will allow booting and configuring a set of VMs and form a cluster.
>>> Simple
>>> example would be LAMP stack. More complex clusters such as mesos or
>>> hadoop
>>> cluster requires a little more advanced configuration. There is already
>>> some work done by Chiradeep Vittal at this front [5] using route and
>> 
>> it's using ruote: http://ruote.rubyforge.org
>> 
>>> sinatra. In this project, I will implement cloudformation service and
>>> demonstrate how to run mesos cluster using it.
>> 
>> You will create cloud formation templates that describe a mesos cluster
>> 
>>> 
>>> Mesos:
>>> 
>>> Mesos is a resource management platform for clusters [2]. It aims to
>>> increase resource utilization of clusters by sharing cluster resources
>>> among multiple processing frameworks(like MapReduce, MPI, Graph
>>> Processing)
>>> or multiple instances of same framework. It provides efficient resource
>>> isolation through use of containers. Uses zookeeper for state
>>> maintenance
>>> and fault tolerance.
>>> 
>>> What can run on mesos ?
>>> 
>>> Spark: A cluster computing framework based on the Resilient Distributed
>>> Datasets (RDDs) abstraction. RDD is more generalized than MapReduce and
>>> can
>>> support iterative and interactive computation while retaining fault
>>> tolerance, scalability, data locality etc.
>>> 
>>> Hadoop: Hadoop is fault tolerant and scalable distributed computing
>>> framework based on MapReduce abstraction.
>>> 
>>> Begel: A graph processing framework based on pregel.
>>> 
>>> and other frameworks like MPI, Hypertable.
>>> 
>>> How to deploy mesos
>>> 
>>> Mesos provides cluster installation scripts [7] for cluster deployment.
>>> There are also scripts available to deploy a cluster on Amazon EC2 [8].
>> 
>> It would be nice to see if these scripts can be used as is with the
>> CloudStack EC2 service.
>> 
>>> 
>>> Deliverables:
>>> 
>>> 1. Cloudformation service implementation on cloudstack.
>>> 
>>> 2. Integration of cloudformation with cloudmonkey, CLI tool.
>> 
>> 2. is a little confusing. I believe that what Chiradeep prototype runs on
>> the client side. What is needed is a server side implementation.
>> That way we could use existing cloudformation cli tools to talk to it.
>> I don't understand where cloudmonkey comes into play. CloudMonkey is a
>> cli for the CloudStack API. Unless you plan to integrate the
>> cloudformation API directly in the cloudstack source code, the
>> integration you propose is not clear to me.
>> 
> 
> Sebastien is correct. I intend to put in the query API server around the
> core of stack mate soon (as soon as I'm done helping on the internal
> loadbalancer). This will be written in Ruby.
> 
> 

Dharmesh, I suggest you propose the following:

1-Deploy CloudStack and understand instance configuration/contextualization
2-Test and deploy Mesos on a set of CloudStack based VMs, manually. Design/propose an automation framework.
3-Test stackmate and engage Chiradeep (report bugs, make suggestions, make pull requests)
4-Create a cloud formation template to provision a Mesos cluster
5-Compare with Apache Whirr or other cluster provisioning tools.
6-Potentially, if you see a link with cloudmonkey, see how you could extend it to talk to stackmate in the same way that it talks to CloudStack.
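
For step 4, a rough sketch of what such a template could look like, generated from Python so the number of slaves is a parameter. The image id, instance type and the slave startup command are placeholders and assumptions, not tested values, and I have not checked which resource types and intrinsic functions stackmate currently supports.

```python
import json


def mesos_template(image_id, instance_type, slave_count):
    """Build a CloudFormation-style template with one master and N slaves."""
    resources = {
        "MesosMaster": {
            "Type": "AWS::EC2::Instance",
            "Properties": {"ImageId": image_id, "InstanceType": instance_type},
        }
    }
    # Assumed startup script: each slave points at the master's private IP.
    slave_user_data = {"Fn::Base64": {"Fn::Join": ["", [
        "#!/bin/sh\n",
        "mesos-slave --master=",
        {"Fn::GetAtt": ["MesosMaster", "PrivateIp"]},
        ":5050\n",
    ]]}}
    for number in range(1, slave_count + 1):
        resources["MesosSlave%d" % number] = {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": image_id,
                "InstanceType": instance_type,
                "UserData": slave_user_data,
            },
        }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Sketch of a small Mesos cluster",
        "Resources": resources,
    }


# Placeholder image id and instance type.
template = mesos_template("mesos-base-image", "m1.small", slave_count=2)
print(json.dumps(template, indent=2, sort_keys=True))
```

The same template file could then be compared against what Whirr needs for an equivalent cluster (step 5).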


You are pretty close and this is a very exciting project, so go ahead, modify your proposal a bit and submit it.

Deadline for applications is this Friday May 3rd.

-sebastien




Re: [GSoC][Proposal] Integration project to deploy and use Mesos on a CloudStack based cloud

Posted by Chiradeep Vittal <Ch...@citrix.com>.

On 4/30/13 5:01 AM, "Sebastien Goasguen" <ru...@gmail.com> wrote:

>Dharmesh, see in-line
>
>On Apr 30, 2013, at 5:34 AM, Dharmesh Kakadia <dh...@gmail.com> wrote:
>
>> Hi,
>> 
>> I am Dharmesh Kakdia and interested in project "Integration project to
>> deploy and use Mesos on a CloudStack based cloud" (
>> https://issues.apache.org/jira/browse/CLOUDSTACK-1784)
>> 
>> I am working on proposal and want to get feedback. Please provide
>> suggestions :)
>> 
>> *
>> 
>> Abstract:
>> 
>> The project aims to bring cloudformation[1] like service to cloudstack.
>>One
>> of the prime use-case is cluster computing frameworks on cloudstack. A
>> cloudformation service will give users and administrators of cloudstack
>> ability to manage and control a set of resources easily. The
>>cloudformation
>> will allow booting and configuring a set of VMs and form a cluster.
>>Simple
>> example would be LAMP stack. More complex clusters such as mesos or
>>hadoop
>> cluster requires a little more advanced configuration. There is already
>> some work done by Chiradeep Vittal at this front [5] using route and
>
>it's using ruote: http://ruote.rubyforge.org
>
>> sinatra. In this project, I will implement cloudformation service and
>> demonstrate how to run mesos cluster using it.
>
>You will create cloud formation templates that describe a mesos cluster
>
>> 
>> Mesos:
>> 
>> Mesos is a resource management platform for clusters [2]. It aims to
>> increase resource utilization of clusters by sharing cluster resources
>> among multiple processing frameworks(like MapReduce, MPI, Graph
>>Processing)
>> or multiple instances of same framework. It provides efficient resource
>> isolation through use of containers. Uses zookeeper for state
>>maintenance
>> and fault tolerance.
>> 
>> What can run on mesos ?
>> 
>> Spark: A cluster computing framework based on the Resilient Distributed
>> Datasets (RDDs) abstraction. RDD is more generalized than MapReduce and
>>can
>> support iterative and interactive computation while retaining fault
>> tolerance, scalability, data locality etc.
>> 
>> Hadoop: Hadoop is fault tolerant and scalable distributed computing
>> framework based on MapReduce abstraction.
>> 
>> Begel: A graph processing framework based on pregel.
>> 
>> and other frameworks like MPI, Hypertable.
>> 
>> How to deploy mesos
>> 
>> Mesos provides cluster installation scripts [7] for cluster deployment.
>> There are also scripts available to deploy a cluster on Amazon EC2 [8].
>
>It would be nice to see if these scripts can be used as is with the
>CloudStack EC2 service.
>
>> 
>> Deliverables:
>> 
>> 1. Cloudformation service implementation on cloudstack.
>> 
>> 2. Integration of cloudformation with cloudmonkey, CLI tool.
>
>2. is a little confusing. I believe that what Chiradeep prototype runs on
>the client side. What is needed is a server side implementation.
>That way we could use existing cloudformation cli tools to talk to it.
>I don't understand where cloudmonkey comes into play. CloudMonkey is a
>cli for the CloudStack API. Unless you plan to integrate the
>cloudformation API directly in the cloudstack source code, the
>integration you propose is not clear to me.
>

Sebastien is correct. I intend to put in the query API server around the
core of stackmate soon (as soon as I'm done helping on the internal
loadbalancer). This will be written in Ruby.



Re: [GSoC][Proposal] Integration project to deploy and use Mesos on a CloudStack based cloud

Posted by Sebastien Goasguen <ru...@gmail.com>.
Dharmesh, see in-line

On Apr 30, 2013, at 5:34 AM, Dharmesh Kakadia <dh...@gmail.com> wrote:

> Hi,
> 
> I am Dharmesh Kakdia and interested in project "Integration project to
> deploy and use Mesos on a CloudStack based cloud" (
> https://issues.apache.org/jira/browse/CLOUDSTACK-1784)
> 
> I am working on proposal and want to get feedback. Please provide
> suggestions :)
> 
> *
> 
> Abstract:
> 
> The project aims to bring cloudformation[1] like service to cloudstack. One
> of the prime use-case is cluster computing frameworks on cloudstack. A
> cloudformation service will give users and administrators of cloudstack
> ability to manage and control a set of resources easily. The cloudformation
> will allow booting and configuring a set of VMs and form a cluster. Simple
> example would be LAMP stack. More complex clusters such as mesos or hadoop
> cluster requires a little more advanced configuration. There is already
> some work done by Chiradeep Vittal at this front [5] using route and

it's using ruote: http://ruote.rubyforge.org

> sinatra. In this project, I will implement cloudformation service and
> demonstrate how to run mesos cluster using it.

You will create cloud formation templates that describe a mesos cluster

> 
> Mesos:
> 
> Mesos is a resource management platform for clusters [2]. It aims to
> increase resource utilization of clusters by sharing cluster resources
> among multiple processing frameworks(like MapReduce, MPI, Graph Processing)
> or multiple instances of same framework. It provides efficient resource
> isolation through use of containers. Uses zookeeper for state maintenance
> and fault tolerance.
> 
> What can run on mesos ?
> 
> Spark: A cluster computing framework based on the Resilient Distributed
> Datasets (RDDs) abstraction. RDD is more generalized than MapReduce and can
> support iterative and interactive computation while retaining fault
> tolerance, scalability, data locality etc.
> 
> Hadoop: Hadoop is fault tolerant and scalable distributed computing
> framework based on MapReduce abstraction.
> 
> Begel: A graph processing framework based on pregel.
> 
> and other frameworks like MPI, Hypertable.
> 
> How to deploy mesos
> 
> Mesos provides cluster installation scripts [7] for cluster deployment.
> There are also scripts available to deploy a cluster on Amazon EC2 [8].

It would be nice to see if these scripts can be used as is with the CloudStack EC2 service.

> 
> Deliverables:
> 
> 1. Cloudformation service implementation on cloudstack.
> 
> 2. Integration of cloudformation with cloudmonkey, CLI tool.

2. is a little confusing. I believe that what Chiradeep prototyped runs on the client side. What is needed is a server side implementation.
That way we could use existing cloudformation cli tools to talk to it.
I don't understand where cloudmonkey comes into play. CloudMonkey is a cli for the CloudStack API. Unless you plan to integrate the cloudformation API directly in the cloudstack source code, the integration you propose is not clear to me.


> 
> 2. Proof of concept of running mesos on top of cloudstack using the service.
> 
> 3. Related documentation.
> 
> Architecture and Tools:
> 
> The high level architecture I propose is as follows:
> 
>  It includes following components:
> 
> 1. CloudFormation ReST server:
> 
> This acts as the point of contact and exposes CloudFormation functionality
> as a ReST service.

I believe CloudFormation is really a Query API.

> This can be accessed directly or through cloudmonkey. I
> will add those functionalities in cloudmonkey. I plan to use dropwizard [3]
> to start with. Later may be the API server can be merged with management
> server. I plan to use mysql for storing details of clusters.

At first, you could do a prototype that is decoupled from CloudStack. You need to clarify the integration with CloudMonkey.

> 
> 2. Provisioning:
> 
> The provisioning module is responsible for handling the booting process of
> the VMs through cloudstack. It uses the cloudstack APIs for launching VMs. I
> plan to use preconfigured templates/images with the required dependencies
> installed, which will make the cluster creation process much faster even for
> large clusters. Error handling is a very important part of this module. For
> example, what do you do if a few VMs fail to boot in a cluster?
> 
> 3. Configuration:
> 
> This module deals with configuring the VMs to form a cluster. This can be
> done via manual scripts/code or via configuration management tools like
> chef. I plan to use workflow automation tools like rundeck [4].

knife-cloudstack provides chef/cloudstack provisioning. You may want to have a look at this.
I would prefer seeing chef or puppet recipes for Mesos (they probably already exist), rather than rundeck.
However, if you do want to use rundeck, check the Apache incubator project Provisionr; I know they use it to provision Hadoop.
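
On the provisioning question quoted above (what to do if a few VMs fail to boot), one simple policy is to retry each failed boot a bounded number of times and report what is still missing. A minimal sketch, assuming a hypothetical launch_vm callable that would wrap the CloudStack deployVirtualMachine call and raise RuntimeError on failure; the flaky launcher below only simulates failures.

```python
def provision_cluster(launch_vm, size, max_attempts=3):
    """Launch `size` VMs, retrying each failed boot up to `max_attempts` times.

    Returns (started, failed): the VM ids that booted and the node indexes
    that never came up.
    """
    started, failed = [], []
    for index in range(size):
        for attempt in range(1, max_attempts + 1):
            try:
                started.append(launch_vm(index))
                break
            except RuntimeError:
                if attempt == max_attempts:
                    failed.append(index)
    return started, failed


# Simulated launcher: node 1 fails once, node 2 never boots.
attempts = {}


def flaky_launch(index):
    attempts[index] = attempts.get(index, 0) + 1
    if index == 1 and attempts[index] == 1:
        raise RuntimeError("boot failed")
    if index == 2:
        raise RuntimeError("boot failed")
    return "vm-%d" % index


started, failed = provision_cluster(flaky_launch, 3)
print(started, failed)
```

Whether a cluster with missing nodes should be kept, rolled back or topped up later is a policy decision the proposal should spell out.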

> 
> In general, I want to use tools around java as much as possible as
> cloudstack is mostly in java. This will make the project easier to maintain
> and develop.
> 
> Why ReST ?
> 
> I believe decoupling provided by the ReST architecture makes it easy to
> extend in future.  Say for example, if one wants to extend the
> cloudformation service to include features like auto-scaling of clusters
> based on some user criteria (rule-based/monitoring etc).
> 

We need to clarify why you want a REST service. If I understand correctly, you want to provide a server side implementation of what Chiradeep has started with stackmate. However, I believe that CloudFormation is really a Query API, so REST may not be needed and could cause interoperability issues with existing CloudFormation tools. Also, we need to clarify how you plan to integrate with cloudmonkey.

If you integrate your CloudFormation API tightly with the mgt server then cloudmonkey will be able to discover them automatically, but otherwise I don't see the link.

It might be easier to create a server side implementation of stackmate (we would need Chiradeep's input on that, so I have cc'd him), then create mesos cluster CF templates. This server would talk directly to an unmodified CloudStack mgt server.
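
To illustrate what I mean by a Query API on the server side, here is a minimal sketch of a dispatcher that routes on the Action parameter instead of on HTTP verbs and paths. The action names are real CloudFormation actions, but the handlers, the in-memory store and the dict responses are assumptions for illustration only (the real service returns XML and actually provisions resources).

```python
from urllib.parse import parse_qs

STACKS = {}  # in-memory stand-in for real stack state


def create_stack(params):
    STACKS[params["StackName"]] = "CREATE_IN_PROGRESS"
    return {"StackId": params["StackName"]}


def describe_stacks(params):
    return {"Stacks": [{"StackName": name, "StackStatus": status}
                       for name, status in sorted(STACKS.items())]}


def delete_stack(params):
    STACKS.pop(params["StackName"], None)
    return {}


ACTIONS = {
    "CreateStack": create_stack,
    "DescribeStacks": describe_stacks,
    "DeleteStack": delete_stack,
}


def handle(query_string):
    """Dispatch one Query-style request; return (http_status, body)."""
    params = {key: values[0] for key, values in parse_qs(query_string).items()}
    handler = ACTIONS.get(params.pop("Action", None))
    if handler is None:
        return 400, {"Error": "InvalidAction"}
    return 200, handler(params)


print(handle("Action=CreateStack&StackName=mesos-demo"))
print(handle("Action=DescribeStacks"))
print(handle("Action=Bogus"))
```

Each handler would eventually call the unmodified CloudStack API, which keeps the server decoupled from the mgt server.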


thanks, this is very exciting.

-sebastien

> Services:
> 
> 1. POST : create a cluster
> 
>    - accepts : cluster configuration json
>    - produces : clusterId
> 
> 2. GET : get the current status of request
> 
>    - accepts : clusterId
>    - produces : json describing current status of the cluster.
> 
> 3. DELETE : remove a cluster
> 
>    - accepts : clusterId
>    - produces : result (success/failure)
> 
> 4. UPDATE : adding a node to a cluster
> 
>    - accepts : cluster configuration json and clusterId
>    - produces : result (success/failure)
> 
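
For reference, the four operations listed above can be sketched as an in-memory service. This is only an illustration of the proposed inputs and outputs; the class, method and field names are hypothetical. Note that HTTP has no UPDATE verb, so a real ReST server would most likely map operation 4 to PUT.

```python
import itertools


class ClusterService:
    """In-memory stand-in for the proposed cluster service."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._clusters = {}

    def post(self, config):
        """Create a cluster from a configuration dict; return its clusterId."""
        cluster_id = "cluster-%d" % next(self._ids)
        self._clusters[cluster_id] = {"config": dict(config), "state": "creating"}
        return cluster_id

    def get(self, cluster_id):
        """Return the current status of a cluster, or None if it is unknown."""
        cluster = self._clusters.get(cluster_id)
        if cluster is None:
            return None
        return {"clusterId": cluster_id, "state": cluster["state"],
                "nodes": cluster["config"].get("nodes", 0)}

    def delete(self, cluster_id):
        """Remove a cluster; return 'success' or 'failure'."""
        removed = self._clusters.pop(cluster_id, None)
        return "success" if removed is not None else "failure"

    def update(self, cluster_id, config):
        """Apply a new configuration, for example one with more nodes."""
        if cluster_id not in self._clusters:
            return "failure"
        self._clusters[cluster_id]["config"].update(config)
        return "success"


service = ClusterService()
cid = service.post({"framework": "mesos", "nodes": 3})
service.update(cid, {"nodes": 4})
print(service.get(cid))
```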
> 
> Timeline:
> 
> 1-1.5 week : project design. Architecture, tools selection, API design.
> 
> 1-1.5 week : getting familiar with cloudstack codebase and architecture
> details.
> 
> 1-1.5 week : getting familiar with mesos internals.
> 
> 1-1.5 week : setting up the dev environment
> 
> 2-3 week : build provisioning and configuration module
> 
> Midterm evaluation: provisioning module, configuration module
> 
> 1-2 week : develop ReST server
> 
> 2-3 week : test and integrate
> 
> About me:
> 
> I am an MS by Research student at the International Institute of
> Information Technology Hyderabad (IIIT-H), Hyderabad, India. I operate our
> small lab cluster running on OpenStack, and I am working on a similar
> project, HadoopStack [6], which aims to bring data processing to a
> multi-cloud environment (work in progress). My area of research is
> scheduling in large scale distributed systems. I have experience with
> related tools like Hadoop, Mesos, OpenStack, Chef, Ironfan and jClouds.
> 
> Email-contact : dhkakadia@gmail.com
> 
> More info: http://researchweb.iiit.ac.in/~dharmesh.kakadia/
> 
> Why me ?
> 
> I love open-source projects. I am fascinated by distributed computing and
> interested in building and optimizing large scale systems and data
> processing frameworks.
> 
> References
> 
> [1] http://aws.amazon.com/cloudformation/
> 
> [2] http://incubator.apache.org/mesos/
> 
> [3] http://dropwizard.codahale.com/
> 
> [4] http://rundeck.org/
> 
> [5] https://github.com/chiradeep/stackmate
> 
> [6] http://siel-iiith.github.io/HadoopStack/
> 
> [7] https://github.com/apache/mesos/blob/trunk/docs/Deploy-Scripts.textile
> 
> [8] https://github.com/apache/mesos/blob/trunk/docs/EC2-Scripts.textile
> 
> In case you have trouble reading this, a Google Docs version is here:
> 
> https://docs.google.com/document/d/1ocoBmyHDtOVnBhCELVt1QcgkubSzCyksls2MCTuDPL0
> 