Posted to dev@stratos.apache.org by Asiri Liyana Arachchi <as...@gmail.com> on 2014/08/18 13:23:27 UTC

[gsoc] Improvements to Autoscaling in Apache Stratos

Hi Devs,

*Requirement 1:* Improve autoscaling to predict the number of instances
required in the next time interval. Currently it predicts the load for the
next time interval, and a threshold is then used to decide whether to scale
up or down. In the current configuration it scales up or down one instance
at a time. [1]

*Implementation:* The number of instances required for the next minute is
calculated based on three factors:

Requests In Flight
Memory Consumption
Load Average

In the *requests in flight* (RIF) based calculation of required instances,
a new approach is used.
Currently the autoscaler expects threshold values, such as an upper limit
and a lower limit, from the autoscaling policy. The new implementation does
not expect them; instead, a threshold is calculated from the LB stats.

How that threshold is calculated:
A new stat, "request served count", is introduced in the LB. It holds the
number of requests served since the last time the stat events were
published to CEP.
Using CEP with a related execution plan, the average number of requests
that an instance can handle is calculated. It is aggregated over a time
window of 10 minutes, and the resulting average is sent to the autoscaler.
Based on that value and the RIF value predicted by the existing prediction
algorithm, the number of instances required is calculated.
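
The calculation described above can be sketched roughly as follows. This is
only an illustration, not the code in the pull request: the class and
function names are hypothetical, the window is kept in plain Python instead
of a CEP execution plan, and the final formula (predicted RIF divided by
the per-instance average, rounded up) is an assumption about how the two
values are combined.

```python
import math
from collections import deque


class ServedRateWindow:
    """Sliding-window average of requests served per instance (hypothetical helper)."""

    def __init__(self, window_seconds=600):
        self.window_seconds = window_seconds  # 10 minutes, as in the description
        self.samples = deque()  # (timestamp, requests served per instance)

    def add(self, timestamp, served_count, active_instances):
        # served_count: requests served since the previous stat publication
        if active_instances > 0:
            self.samples.append((timestamp, served_count / active_instances))
        # Drop samples that have fallen out of the time window.
        while self.samples and self.samples[0][0] <= timestamp - self.window_seconds:
            self.samples.popleft()

    def average(self):
        if not self.samples:
            return None
        return sum(value for _, value in self.samples) / len(self.samples)


def required_instances(predicted_rif, avg_served_per_instance, min_instances=1):
    # Assumed formula: predicted load over per-instance capacity, rounded up.
    if not avg_served_per_instance or avg_served_per_instance <= 0:
        return min_instances
    return max(min_instances, math.ceil(predicted_rif / avg_served_per_instance))
```

For example, with per-instance samples of 50 and 60 in the window the
average is 55, and a predicted RIF of 200 gives 4 required instances.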

For *Memory Consumption* and *Load Average* it's not possible to calculate
thresholds as is done for RIF, so the current stats are used to calculate
the required number of instances for the next minute.
Once the required number of instances is calculated, the autoscaler scales
instances up or down based on the logic written in the rule file.

As suggested, scaling down is done slowly to reduce the high variation in
spawning and terminating instances.
This configuration has been tested successfully. The testing considered
most of the issues that can occur in an actual production environment.
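
A minimal sketch of the "scale down slowly" idea, under stated assumptions:
the function name, the one-instance step, and the min/max bounds are
hypothetical, and the actual logic lives in the rule file. Here scaling up
jumps straight to the required count, while scaling down removes at most
one instance per evaluation.

```python
def next_instance_count(current, required, min_instances=1, max_instances=10,
                        scale_down_step=1):
    # Keep the target inside the allowed range (bounds are assumed values).
    required = max(min_instances, min(max_instances, required))
    if required >= current:
        # Scale up directly to the required number of instances.
        return required
    # Scale down gradually, never going below the required count.
    return max(required, current - scale_down_step)
```

With current = 5 and required = 2, successive evaluations give 4, 3 and
then 2, instead of terminating three instances at once.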

It may look as though instances are spawned with less control, and with a
rapid rate of spawning and terminating, but that is not the case. The
predicted values of the metrics under consideration are included in the
equation, and scaling down slowly gives the implementation more stability.
It's more responsive than the previous configuration and provides high
availability.

Pull request [2] has already been made.

*Requirement 2:* Predict the load according to a schedule defined by the
end user.
Seasonal load expectations will be handled by this aspect. [1]

*Implementation Plan:* Extend the deployment policy so that schedule info
can be added, with attributes for time and for maximum and minimum
partition count.

The partition maximums and minimums defined in the partition are kept as
default values while the system is operational. Once the scheduled time
starts, those values are replaced with the schedule-defined values.
I couldn't complete this aspect because setting up the Stratos development
environment took an unexpected amount of time. I'll continue the work and
complete it.
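
As a rough illustration of the plan above (this part is not implemented
yet, so everything here is an assumption): the ScheduleEntry type, its
field names, and the explicit end time are hypothetical, and windows that
cross midnight are not handled.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass
class ScheduleEntry:
    start: time          # when the scheduled values take effect
    end: time            # assumed end of the scheduled period (exclusive)
    partition_min: int
    partition_max: int


def effective_limits(now, default_min, default_max, schedule):
    # Use the schedule-defined values while a scheduled period is active.
    for entry in schedule:
        if entry.start <= now.time() < entry.end:
            return entry.partition_min, entry.partition_max
    # Otherwise fall back to the defaults defined in the partition.
    return default_min, default_max
```

With a single entry from 09:00 to 17:00 carrying min 3 and max 10, a check
at noon returns (3, 10), and a check at 20:00 returns the partition
defaults.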

Thank you so much, Stratos community, for the support offered so far.

Related mail threads: [3]

[1]: https://issues.apache.org/jira/browse/STRATOS-488
[2]: https://github.com/apache/stratos/pull/17
[3]: https://www.mail-archive.com/dev@stratos.apache.org/msg00077.html
     http://mail-archives.apache.org/mod_mbox/stratos-dev/201403.mbox/%3CCAFWrs++Z_CZAy7uvAAgHvwRwNKUAOh_vRAwEhP_SndbsjqZapg@mail.gmail.com%3E

Regards,
Asiri

Re: [gsoc] Improvements to Autoscaling in Apache Stratos

Posted by Lahiru Sandaruwan <la...@wso2.com>.

Great work, Asiri. We can prepare a demonstration with a public hangout in
the near future. We will go through your PR and give feedback.



On Mon, Aug 18, 2014 at 4:53 PM, Asiri Liyana Arachchi <as...@gmail.com>
wrote:

> Hi Devs,
>
> *Requirement  1: * Improve Autoscaling to predict the number of instances
> required in the next time interval. Currently it predicts the load for next
> time interval. Then a  threshold is used to decide on scale up or down.
> In the current configuration it scales up or down one instance at a time.
> [1]
>
> *Implementation:* Number of required instance at the next minute is
> calculated based on three factors.
>
> Requests In Flight
> Memory Consumption
> Load Average
>
> In the *requests in flight* based required instance calculation, a new
> approach has been used.
> Currently autoscalar expects threshold values like upper limit and lower
> limit from the autoscaling policy. In the new implementation it's not
> expected and a threshold is calculated using lb stats.
>
> How that threshold is calculated:
> A new stat "request served count" is introduced in LB. It keeps the value
> of how many requests have been served since the last time the stat events
> has been published to CEP.
> Using CEP with related execution plan the average number of requests that
> an instance can handle stat is being calculated. It's aggregated over a
> time window of 10 minutes and a average is calculated and sent to
> autoscalar.
> Based on that value and the the predicted value of rif by the prevailing
> predicting algorithm, the number of instances required is being calculated.
>
> For the *Memory Consumption *and* Load Average *it's not possible to
> calculate thresholds for them like in the rif. So the current stats are
> being used to calculate required number of instances for  the next minute.
> Once the required number of instances are calculated autoscalar will scale
> up or down instances based on the logic written in rule file.
>
> As suggested when scaling down it's done slowly to reduce the high
> variation  in spawning and terminating instances.
> This configuration is successfully tested. In testing considered most of
> the issues which can occur during an actual production environment.
>
> This may look like ,that the  instances are spawned a lot with less
> control and with rapid rate of spawning and terminating . But actually it
> doesn't. Because the predicted values of considering metrics are included
> in the equation and the scaling down slowly provides more stability to the
> implementation.
> It's more responsive than the previous configuration and provides high
> availability.
>
> Pull request [2] has been already made.
>
>
> *Requirement  2: * Predict the load according to a schedule defined by
> end user
> Seasonal load expectation will be handled by this aspect. [1]
>
> *Implementation Plan : *Define the deployment  policy allowing to add
> schedule info with attributes for time , maximum and minimum partition
> count.
>
> The partition maxes and mins defined in the partition are kept as default
> values while the system is operational. Once the scheduled time starts
> those values will be replaced with schedule defined values.
> Couldn't complete this aspect due to unexpected time amount had to spend
> for setting up the stratos development environment. I'll continue the work
> and will complete it.
>

Looks like a good plan.

Thanks.




-- 
--
Lahiru Sandaruwan
Committer and PMC member, Apache Stratos,
Senior Software Engineer,
WSO2 Inc., http://wso2.com
lean.enterprise.middleware

email: lahirus@wso2.com cell: (+94) 773 325 954
blog: http://lahiruwrites.blogspot.com/
twitter: http://twitter.com/lahirus
linked-in: http://lk.linkedin.com/pub/lahiru-sandaruwan/16/153/146