Posted to users@cloudstack.apache.org by Gabriel Beims Bräscher <ga...@gmail.com> on 2016/04/11 17:44:31 UTC

How do you manage/improve your ACS environment?

Dear Apache CloudStack users,

I have been discussing with some colleagues solutions that can help
us manage our Apache CloudStack (ACS) environments. By management I
mean dealing with service level agreements (SLAs), managing idle
hosts, balancing the environment's virtual machine loads, and
monitoring and tracking resource usage (not only allocation).

Have you dealt with these situations, or any others that you would
consider day-to-day management? How did you work them out?

It would be helpful to hear your thoughts.

Thanks for your time,
Gabriel.

Re: How do you manage/improve your ACS environment?

Posted by ilya <il...@gmail.com>.
Gabriel,

With regard to operational issues, management and SLAs, several
initiatives come to mind:

1) Rewrite and enhance CloudStack HA (being worked on) - the specs are
posted on Confluence, targeting KVM primarily, as Xen and VMware already
have this problem solved.

2) Distributed Resource Scheduler (work begins in 2 months) - similar to
VMware DRS, except it is less sophisticated and pluggable. It will move
the VMs within your cluster to the host with the least usage. It can be
extended to shut down idle hosts and bring them up when needed; see
feature #3 below.

3) IPMI support (completed) - the ability to issue power-level commands
through a host's IPMI interface (iLO, DRAC, etc.).

4) Usage Metrics View - an enhancement to the UI that shows current usage
and hot spots in your environment. Already available.

More to come; this is what I know of so far.

Regards
ilya





Re: How do you manage/improve your ACS environment?

Posted by ilya <il...@gmail.com>.
Rafael

Please see response in-line:

On 4/12/16 3:15 PM, Rafael Weingärtner wrote:
> Ilya, that is interesting.
> By multiple CloudStack environments, do you mean environments that do not
> have any link between them? I mean, they are not “regions” of one another
> or something like that.
>
> Do you intend to manage configurations such as over-provisioning factors,
> allocation thresholds, wait timeouts and others? What else do you expect
> to manage?

Generally, a decent-sized implementation will have one or more CloudStack
installations in a single datacenter. There can be many datacenters, so
the CloudStack infrastructure grows rather quickly. It is also very common
to front CloudStack with one common dashboard. The challenge arises when
all of the CloudStack installations must have identical configurations.


One way to do it is to integrate the CloudStack configs with your main
dashboard and let it manage all aspects of configuration. Another way
would be to have a set of scripts that manage all your environments and
keep them in sync.

A step above the scripted approach would be something like a REST service
that can manage all of your configurations from a single endpoint. This
covers users, templates (which always change), zone and global
settings, service and disk offerings, user roles, etc.
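
As a rough illustration of the scripted approach, the sketch below
compares the global settings of several CloudStack installations against
a reference copy and lists the updates needed to bring them back in sync.
It is only a minimal sketch: the settings are plain dictionaries here,
whereas a real script would presumably read them with the
listConfigurations API command and apply changes with updateConfiguration.
The function name plan_sync, the installation names and the sample values
are made up for illustration.

```python
from typing import Dict, List, Tuple

Settings = Dict[str, str]


def plan_sync(reference: Settings,
              environments: Dict[str, Settings]) -> List[Tuple[str, str, str]]:
    """Return (environment, setting, desired_value) for every setting that
    is missing from an environment or differs from the reference."""
    updates = []
    for env_name in sorted(environments):
        current = environments[env_name]
        for key in sorted(reference):
            if current.get(key) != reference[key]:
                updates.append((env_name, key, reference[key]))
    return updates


# Hypothetical sample data: two installations, one of which has drifted.
reference = {"cpu.overprovisioning.factor": "2.0", "wait": "1800"}
environments = {
    "dc1": {"cpu.overprovisioning.factor": "2.0", "wait": "1800"},
    "dc2": {"cpu.overprovisioning.factor": "4.0"},
}

for env, key, value in plan_sync(reference, environments):
    print(f"{env}: set {key} = {value}")
```

Keeping the comparison separate from the code that talks to each
installation makes it possible to review the planned changes before
anything is applied.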

There is also a notion of profiles. That is, I may have one set of
CloudStacks I use for production X and another for production Y. The
configs and settings may be different for the X and Y CloudStack
environments. Yet the Y environment may have 10 CloudStacks in different
datacenters, and X 20 CloudStacks in other datacenters.
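
One way to picture profiles is as a base set of settings plus per-profile
overrides, with each installation assigned to one profile. The following
is a minimal sketch under that assumption; the profile names, installation
names and setting values are hypothetical and are not taken from any
existing CloudStack feature.

```python
from typing import Dict

# Settings shared by every installation (hypothetical values).
BASE = {"expunge.delay": "86400", "cpu.overprovisioning.factor": "1.0"}

# Per-profile overrides layered on top of BASE.
PROFILES: Dict[str, Dict[str, str]] = {
    "production-x": {"cpu.overprovisioning.factor": "2.0"},
    "production-y": {"expunge.delay": "3600"},
}

# Which profile each installation belongs to.
ASSIGNMENTS = {
    "x-dc1": "production-x",
    "y-dc1": "production-y",
    "y-dc2": "production-y",
}


def desired_settings(installation: str) -> Dict[str, str]:
    """Merge BASE with the overrides of the installation's profile."""
    profile = ASSIGNMENTS[installation]
    return {**BASE, **PROFILES[profile]}


print(desired_settings("y-dc2"))
```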

> I understand the point that now your employer may not want to open it; but,
> maybe in the future ;)

Maybe in the future. We are working on changing it from within, but it's
an uphill battle. There are many things we want to release to the
community that we can't at the moment.

> 
> We have some other questions here.
> For instance, consider the balancing of VM workloads in the cloud
> environment. Of course, one can extend the CloudStack allocation
> algorithms and try to place VMs in a balanced manner. However, that would
> be based only on resource allocation. I believe that the
> balancing/dispersion of VMs should also consider resource usage. The
> problem is that cloud computing is a rather dynamic environment; workloads
> can change their pattern of resource usage all the time. Therefore,
> balancing the VMs only at deploy time would not be the best option. Maybe
> some agents that work autonomously to manage the environment would be a
> better approach. The size of such environments is also a huge challenge;
> an agent would not be able to gather information, analyse it and then act
> upon the whole environment at once. It would be better to act upon pieces
> of the environment at a time.
>
> Another example is the number of idle hosts in cloud environments. Given
> the dynamic nature of the cloud, sometimes the load is pretty high while
> at other times it is pretty low. For example, in my environment (right
> now) the allocation ratio of memory is 91% and of CPU 94%. However, for at
> least 5-6 months a year we have ratios below 60%. That means that we
> have idle servers wasting energy and cooling. Again we have the same
> problem: a dynamic environment in which we never know for sure when a load
> is coming. For this problem, an agent that constantly acts on the
> environment would be interesting. This agent could detect unused servers
> and power them off; then, if a shortage of resources happens, it could
> start the deactivated servers. This way we could save a great deal of
> energy.
> 
> Has any of you here thought about or dealt with such situations? Do you
> care about such problems?
> 

This use case is very common, and the feature I talked about in another
email thread will try to solve this issue. For now, there is the UI Metrics
enhancement that can help you identify hot spots. In the near future, we
will work on a simplified version of VMware's Distributed Resource
Scheduler. The idea behind this feature is to make it pluggable,
allowing a CloudStack admin to write a set of rules by which CloudStack
will try to shift the workload. We will have a few sample load-balancing
algorithms that should satisfy the most common use cases, but if you want
to get creative, you should be able to write a plugin with rules that make
sense to you and your environment.


If you want to solve this problem now, you can easily write a script
that will analyze your environment and move VMs as needed. It may not be
super efficient, but it is better than moving VMs by hand.
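
A minimal sketch of what such a script might look like is shown below. It
only plans migrations from measured usage figures; it does not talk to
CloudStack. The data layout, the function names and the thresholds are
assumptions made for this example, and the greedy rule is just one simple
choice, not the planned scheduler. In practice the usage numbers would
come from your monitoring, and each planned move would presumably be
carried out with the migrateVirtualMachine API command.

```python
from typing import Dict, List, Tuple

# host name -> {vm name -> measured CPU usage, in percent of the host}
Placement = Dict[str, Dict[str, float]]


def host_load(vms: Dict[str, float]) -> float:
    """Total usage of all VMs on one host."""
    return sum(vms.values())


def plan_moves(placement: Placement, max_gap: float = 20.0,
               max_moves: int = 10) -> List[Tuple[str, str, str]]:
    """Greedily plan (vm, source, destination) migrations until the gap
    between the busiest and the least busy host is at most max_gap."""
    # Work on a copy so the caller's data is left untouched.
    state = {host: dict(vms) for host, vms in placement.items()}
    moves = []
    while len(moves) < max_moves:
        busiest = max(state, key=lambda h: host_load(state[h]))
        idlest = min(state, key=lambda h: host_load(state[h]))
        gap = host_load(state[busiest]) - host_load(state[idlest])
        if gap <= max_gap:
            break
        # Only VMs with usage between 0 and the gap narrow it when moved.
        candidates = {vm: use for vm, use in state[busiest].items()
                      if 0 < use < gap}
        if not candidates:
            break
        # Prefer the VM whose usage is closest to half the gap.
        vm = min(candidates, key=lambda v: abs(candidates[v] - gap / 2))
        state[idlest][vm] = state[busiest].pop(vm)
        moves.append((vm, busiest, idlest))
    return moves


# Hypothetical usage figures for three hosts.
placement = {
    "host1": {"vm-a": 50.0, "vm-b": 20.0, "vm-c": 10.0},
    "host2": {"vm-d": 20.0},
    "host3": {"vm-e": 40.0},
}

for vm, src, dst in plan_moves(placement):
    print(f"move {vm} from {src} to {dst}")
```

The max_moves limit keeps each run small, so the script acts on a piece
of the environment at a time instead of reshuffling everything at once.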

Regards
ilya


Re: How do you manage/improve your ACS environment?

Posted by Rafael Weingärtner <ra...@gmail.com>.
Ilya, that is interesting.
By multiple CloudStack environments, do you mean environments that do not
have any link between them? I mean, they are not “regions” of one another
or something like that.

Do you intend to manage configurations such as over-provisioning factors,
allocation thresholds, wait timeouts and others? What else do you expect
to manage?

I understand the point that now your employer may not want to open it; but,
maybe in the future ;)

We have some other questions here.
For instance, consider the balancing of VM workloads in the cloud
environment. Of course, one can extend the CloudStack allocation
algorithms and try to place VMs in a balanced manner. However, that would
be based only on resource allocation. I believe that the
balancing/dispersion of VMs should also consider resource usage. The
problem is that cloud computing is a rather dynamic environment; workloads
can change their pattern of resource usage all the time. Therefore,
balancing the VMs only at deploy time would not be the best option. Maybe
some agents that work autonomously to manage the environment would be a
better approach. The size of such environments is also a huge challenge;
an agent would not be able to gather information, analyse it and then act
upon the whole environment at once. It would be better to act upon pieces
of the environment at a time.

Another example is the number of idle hosts in cloud environments. Given
the dynamic nature of the cloud, sometimes the load is pretty high while
at other times it is pretty low. For example, in my environment (right
now) the allocation ratio of memory is 91% and of CPU 94%. However, for at
least 5-6 months a year we have ratios below 60%. That means that we
have idle servers wasting energy and cooling. Again we have the same
problem: a dynamic environment in which we never know for sure when a load
is coming. For this problem, an agent that constantly acts on the
environment would be interesting. This agent could detect unused servers
and power them off; then, if a shortage of resources happens, it could
start the deactivated servers. This way we could save a great deal of
energy.
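
To make the idle-host idea a bit more concrete, here is a minimal sketch
of the decision such an agent might take on each pass. It assumes the
agent already knows the memory capacity and allocation of every host; the
Host class, the function name and the 80% ceiling are invented for this
example. How the hosts are actually powered off and on (for example
through IPMI) is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    name: str
    capacity_mb: int  # total memory of the host
    used_mb: int      # memory allocated to running VMs


def hosts_to_power_off(hosts: List[Host],
                       max_utilization: float = 0.8) -> List[str]:
    """Pick empty hosts that can be powered off while the remaining hosts
    stay at or below max_utilization overall."""
    total_capacity = sum(h.capacity_mb for h in hosts)
    total_used = sum(h.used_mb for h in hosts)
    selected = []
    # Only hosts without any load are candidates; try the largest first.
    empty = sorted((h for h in hosts if h.used_mb == 0),
                   key=lambda h: h.capacity_mb, reverse=True)
    for host in empty:
        remaining = total_capacity - host.capacity_mb
        if remaining > 0 and total_used / remaining <= max_utilization:
            selected.append(host.name)
            total_capacity = remaining
    return selected


# Hypothetical cluster: two loaded hosts and two empty ones.
hosts = [
    Host("h1", 100, 90),
    Host("h2", 100, 80),
    Host("h3", 100, 0),
    Host("h4", 100, 0),
]
print(hosts_to_power_off(hosts))
```

The ceiling leaves headroom for a sudden load: with these numbers only
one of the two empty hosts is selected, because switching off both would
push the remaining hosts above 80%.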

Has any of you here thought about or dealt with such situations? Do you
care about such problems?





-- 
Rafael Weingärtner

Re: How do you manage/improve your ACS environment?

Posted by ilya <il...@gmail.com>.
I started working on another project that will consolidate multiple
CloudStacks into "1" to manage and check the health of the dispersed
CloudStack environments. Conceptually, anything I do through "CloudStack
Manager" would be replayed down to all the other dispersed CloudStacks
across the globe.

It would be relatively easy to write "CloudStack Manager" since
CloudStack is API-driven via a REST-like interface.
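
To give an idea of what talking to that interface involves, the sketch
below builds a signed query string for one API call. It follows the
signing scheme as the CloudStack developer documentation describes it, to
the best of my understanding: URL-encode the values, sort the fields,
lower-case the string, and sign it with HMAC-SHA1 using the secret key.
The function name and the sample keys are placeholders, and edge cases in
URL encoding may differ from what the server expects, so treat it as a
starting point rather than a reference implementation.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_request(params: dict, api_key: str, secret_key: str) -> str:
    """Return a signed query string for a CloudStack API call."""
    fields = dict(params, apikey=api_key)
    # URL-encode each value, then sort the pairs by lower-cased field name.
    pairs = sorted(((k, quote(str(v), safe="")) for k, v in fields.items()),
                   key=lambda kv: kv[0].lower())
    query = "&".join(f"{k}={v}" for k, v in pairs)
    # The signature is computed over the lower-cased query string.
    digest = hmac.new(secret_key.encode("utf-8"),
                      query.lower().encode("utf-8"),
                      hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"{query}&signature={quote(signature, safe='')}"


# Placeholder credentials; listHosts is used only as a sample command.
print(sign_request({"command": "listHosts", "response": "json"},
                   "my-api-key", "my-secret-key"))
```

The resulting string is appended to the API endpoint URL of each
installation, so replaying the same command against many CloudStacks
mostly means repeating this step with each installation's own keys.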

Unfortunately, I don't believe my employer would let me open source this
project, though I will try.




Re: How do you manage/improve your ACS environment?

Posted by ilya <il...@gmail.com>.
Gabriel

What I am mentioning is all going to be donated to ACS (and kindly
developed by the great team at ShapeBlue).

We have many more things in the pipeline to make ACS better; we just can't
speak of them as we haven't finalized the internal feature roadmap.

CloudStack Manager will be developed in-house initially, and if all the
stars align we will work with ShapeBlue to make a community version
available.

Regards
ilya







Re: How do you manage/improve your ACS environment?

Posted by Gabriel Beims Bräscher <ga...@gmail.com>.
Ilya

Everything sounds very interesting. I am happy to hear about projects
working towards the improvement of ACS. Are there prospects of this
being donated to the ACS project?

Just curious: is the usage metrics view you mentioned related to the
work presented by Rohit in [
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Metrics+Views+for+CloudStack+UI
]?

Regards


Re: How do you manage/improve your ACS environment?

Posted by ilya <il...@gmail.com>.
Gabriel,

With regard to operational issues and SLAs, several initiatives
come to mind:

1) Rewrite and enhance CloudStack HA (being worked on) - the specs are
posted on Confluence, targeting KVM primarily, as Xen and VMware already
have this problem solved.

2) Distributed Resource Scheduler (work begins in 2 months) - similar to
VMware DRS, except it is less sophisticated and pluggable. It will move
the VMs within your cluster to the host with the least usage. It can be
extended to shut down idle hosts and bring them up when needed; see
feature #3 below.

3) IPMI support (completed) - the ability to issue power-level commands
through a host's IPMI interface (iLO, DRAC, etc.).

4) Usage Metrics View - an enhancement to the UI that shows current usage
and hot spots in your environment.


