Posted to dev@stratos.apache.org by "Michael Hall (michaha2)" <mi...@cisco.com> on 2015/02/12 17:03:36 UTC

autoscale architecture

Hi Devs,

Is there a resource or contact that can help me understand the current and planned architecture of the autoscaling feature within Stratos?

Best Regards,

Mike

Re: autoscale architecture

Posted by "Michael Hall (michaha2)" <mi...@cisco.com>.

That’s a good plan.

My work number is +442088242650

I’m around now, but will break for a while for lunch in an hour or so.

Cheers

From: Lakmal Warusawithana <la...@wso2.com>
Date: Friday, 13 February 2015 11:24
To: "dev@stratos.apache.org" <de...@stratos.apache.org>, Imesh Gunaratne <im...@wso2.com>, Michael Hall <mi...@cisco.com>
Subject: Re: autoscale architecture

Shall we go for a call? It will be more productive.

Re: autoscale architecture

Posted by Lakmal Warusawithana <la...@wso2.com>.

Shall we go for a call? It will be more productive.


-- 
Lakmal Warusawithana
Vice President, Apache Stratos
Director - Cloud Architecture; WSO2 Inc.
Mobile : +94714289692
Blog : http://lakmalsview.blogspot.com/

Re: autoscale architecture

Posted by Lakmal Warusawithana <la...@wso2.com>.

Hi Michael

On Fri, Feb 13, 2015 at 4:14 PM, Michael Hall (michaha2) <michaha2@cisco.com> wrote:

>  Hi Imesh,
>
>  So ‘transition compensated’ refers to cartridges that are transitioning
> from SPAWNED to ACTIVE, or from TERMINATING to TERMINATED.
>
>  What it really means is that the ‘aggregated average’ (referred to as
> <metric>PredictedValue in scaling.drl) is compensated:
>
>    1. As if the ‘spawning’ cartridges are already providing resource
>    (although they aren’t yet)
>    2. As if the ‘terminating’ cartridges have already removed resource
>    (although they haven’t yet)
>
> The ‘transition compensated aggregated average’ is then approximately
> what the actual aggregated average would be if those cartridges had become
> fully ‘active’ or ‘terminated’. This means it is always in a sensible state
> for making a scaling decision.
>
>  This allows us to make a scaling decision as often as we’d like (at
> intervals much shorter than 90 seconds, possibly even every second). For
> example, once we’ve scaled up from N to N+1 cartridges, the ‘transition
> compensated aggregated average’ instantly adjusts to N/(N+1) of its raw
> value (formula copied from my previous email for reference below), so
> another scaling decision will only occur if the underlying load
> (aggregated average) increases even further.
>
>  *transition-compensated-agg-ave = agg-ave * ( cluster-size /
> ( cluster-size + cluster-spawned-size - cluster-terminating-size ) )*
>
>
I think this is a good proposal; it will definitely help to calculate more
accurate agg-ave values. Since CEP has the topology information, we can
easily calculate this.

AFAIK, the autoscaler takes care of cartridge states when calculating the
required instance count for a predicted load.
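
A minimal Python sketch of how the compensation factor might be derived from per-member states in the topology. The function name and the list-of-strings input are hypothetical, not CEP or Stratos APIs, and the sketch assumes that ACTIVE and TERMINATING members both still serve load:

```python
from collections import Counter


def compensation_factor(member_states):
    """Return cluster-size / (cluster-size + spawned - terminating).

    member_states: iterable of state names, one per cluster member.
    Assumption: ACTIVE and TERMINATING members are both still serving load.
    """
    counts = Counter(member_states)
    spawned = counts["SPAWNED"]
    terminating = counts["TERMINATING"]
    cluster_size = counts["ACTIVE"] + terminating
    projected = cluster_size + spawned - terminating
    if cluster_size == 0 or projected <= 0:
        return 1.0  # nothing meaningful to compensate; leave agg-ave unchanged
    return cluster_size / projected


states = ["ACTIVE", "ACTIVE", "ACTIVE", "SPAWNED"]
print(compensation_factor(states))  # 0.75
```

Multiplying the raw agg-ave by this factor gives the compensated value; with no members in transition the factor is 1 and the average is unchanged.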




-- 
Lakmal Warusawithana
Vice President, Apache Stratos
Director - Cloud Architecture; WSO2 Inc.
Mobile : +94714289692
Blog : http://lakmalsview.blogspot.com/

Re: autoscale architecture

Posted by "Michael Hall (michaha2)" <mi...@cisco.com>.

Hi Imesh,

So ‘transition compensated’ refers to cartridges that are transitioning from SPAWNED to ACTIVE, or from TERMINATING to TERMINATED.

What it really means is that the ‘aggregated average’ (referred to as <metric>PredictedValue in scaling.drl) is compensated:

  1.  As if the ‘spawning’ cartridges are already providing resource (although they aren’t yet)
  2.  As if the ‘terminating’ cartridges have already removed resource (although they haven’t yet)

The ‘transition compensated aggregated average’ is then approximately what the actual aggregated average would be if those cartridges had become fully ‘active’ or ‘terminated’. This means it is always in a sensible state for making a scaling decision.

This allows us to make a scaling decision as often as we’d like (at intervals much shorter than 90 seconds, possibly even every second). For example, once we’ve scaled up from N to N+1 cartridges, the ‘transition compensated aggregated average’ instantly adjusts to N/(N+1) of its raw value (formula copied from my previous email for reference below), so another scaling decision will only occur if the underlying load (aggregated average) increases even further.

transition-compensated-agg-ave = agg-ave * ( cluster-size / ( cluster-size + cluster-spawned-size - cluster-terminating-size ) )
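
As a minimal sketch of the formula above, in Python with hypothetical names (this is not code from Stratos or scaling.drl), the compensation could be computed as follows. It assumes cluster-size counts the members currently serving load, including any that are still terminating:

```python
def transition_compensated_average(agg_ave, cluster_size, spawned, terminating):
    """Scale the raw aggregated average as if pending transitions had completed.

    cluster_size: members currently serving load (assumed to include
                  members that are still terminating)
    spawned:      members launched but not yet ACTIVE
    terminating:  members shutting down but not yet TERMINATED
    """
    projected_size = cluster_size + spawned - terminating
    if projected_size <= 0:
        raise ValueError("projected cluster size must be positive")
    return agg_ave * cluster_size / projected_size


# Scaling up from 4 to 5 members: the compensated value drops to 4/5 of the raw one.
print(transition_compensated_average(90.0, 4, 1, 0))  # 72.0
```

With a raw average of 90 and a scale-up threshold of, say, 80, the compensated value of 72 no longer triggers a second scale-up while the new member is still starting.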

I’d be more than happy to set up a WebEx meeting to explain this better, or to use whatever other avenue of communication you prefer.

Kind regards,

Mike

Re: autoscale architecture

Posted by Imesh Gunaratne <im...@apache.org>.

Hi Mike,

Thanks for the detailed explanation of your question. Currently we do not
have the capability to do this at runtime for a specific cartridge. However,
we could reduce the global scaling decision interval. This needs to be
configured at three locations:

1. Cartridge agent statistics publishing interval (default: 15 seconds)
2. CEP execution plan/faulty member detection interval (default: 1 min)
3. Autoscaler cluster monitor interval (default: 90 seconds)
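
A rough Python sketch of why all three intervals matter. It assumes the three stages run as independent periodic loops, so that in the worst case each adds a full interval of delay (processing time is ignored); the dictionary keys are illustrative labels, not Stratos configuration property names:

```python
# Default intervals from the list above, in seconds.
# The keys are illustrative labels, not real Stratos configuration keys.
DEFAULT_INTERVALS = {
    "agent_stats_publishing": 15,
    "cep_execution_plan": 60,
    "autoscaler_cluster_monitor": 90,
}


def worst_case_reaction_delay(intervals):
    """Upper bound on the delay if every stage just missed the previous update."""
    return sum(intervals.values())


print(worst_case_reaction_delay(DEFAULT_INTERVALS))  # 165

# Shortening only the cluster monitor still leaves the other two stages in the path.
faster = dict(DEFAULT_INTERVALS, autoscaler_cluster_monitor=10)
print(worst_case_reaction_delay(faster))  # 85
```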

I did not quite understand what you mean by 'transition compensated'. Could
you explain it further?

Thanks


-- 
Imesh Gunaratne

Technical Lead, WSO2
Committer & PMC Member, Apache Stratos

Re: autoscale architecture

Posted by "Michael Hall (michaha2)" <mi...@cisco.com>.

Hi Dev,

Thanks for your response, Imesh. If it’s OK, I’d like to skip straight to my (rather lengthy) question:

Does the autoscaler currently have, or are there plans to introduce, a means of receiving an asynchronous event signalling that a cartridge has gone from ‘SPAWNED’ to ‘ACTIVE’ after being launched by a ‘scale-up’ decision, so that the scaling decision interval can decrease to approximately the metric update interval and multiple cartridges are not spawned when only one is needed?

In more depth:

The reason for my question is that, by knowing a cartridge is in the ‘SPAWNED’ or ‘TERMINATING’ state, the aggregated metric averages can be ‘transition compensated’, i.e.
transition-compensated-agg-ave = agg-ave * ( cluster-size / ( cluster-size + cluster-spawned-size - cluster-terminating-size ) )
This allows scaling decisions to occur on a continuous basis, throttled only by the metric update frequency.

It appears that scaling decisions currently occur on the order of minutes. If this became seconds, it would vastly improve the maximum rate at which a cluster can scale up against a sudden increase in load.

It also appears that there is no awareness of the spawning state, which means several ‘redundant’ instances get spawned when instance startup time is greater than the scaling decision interval.
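
The redundant-spawn effect can be illustrated with a small Python simulation. This is a toy model with made-up numbers and hypothetical names, not the Stratos autoscaler: it assumes a constant total load shared evenly by ACTIVE instances and a rule that scales up by one instance whenever the per-instance average exceeds a threshold.

```python
def instances_spawned(load, threshold, active, startup_time, decision_interval,
                      compensate):
    """Count instances launched for a constant total load.

    load:              total load shared evenly by ACTIVE instances
    threshold:         per-instance average above which we scale up by one
    startup_time:      seconds from SPAWNED to ACTIVE
    decision_interval: seconds between scaling decisions
    compensate:        if True, count SPAWNED instances as already serving
    """
    pending = []  # activation times of SPAWNED instances
    launched = 0
    t = 0
    horizon = 10 * startup_time
    while t <= horizon:
        # Promote instances whose startup has finished.
        ready = [p for p in pending if p <= t]
        active += len(ready)
        pending = [p for p in pending if p > t]

        average = load / active  # raw aggregated average
        if compensate:
            average = average * active / (active + len(pending))
        if average > threshold:
            pending.append(t + startup_time)
            launched += 1
        t += decision_interval
    return launched


# 4 instances, 60 s startup, a decision every 15 s; one extra instance is enough.
print(instances_spawned(450, 100, 4, 60, 15, compensate=False))  # 4
print(instances_spawned(450, 100, 4, 60, 15, compensate=True))   # 1
```

In this model the uncompensated rule launches one instance per decision until the first one becomes ACTIVE, while the compensated rule launches only the one that is needed.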

Finally:

Are there difficulties in tracking the ‘SPAWNED’ to ‘ACTIVE’ transition on a per-cartridge basis? And how does this align (if it’s a valid enhancement) with other potential improvements that could be made to the autoscaler?

Regards,

Mike

Re: autoscale architecture

Posted by Imesh Gunaratne <im...@apache.org>.

Hi Michael,

Yes, you can ask any questions you have on Autoscaling here.

I don't think we have documented the Autoscaling feature in 4.1.0 at the
moment. However, you can find some information here [1]. Autoscaling has
changed slightly with the Composite Application Model.

[1] https://cwiki.apache.org/confluence/display/STRATOS/4.1.0+Autoscaler

Thanks

-- 
Imesh Gunaratne

Technical Lead, WSO2
Committer & PMC Member, Apache Stratos