Posted to dev@stratos.apache.org by Lahiru Sandaruwan <la...@wso2.com> on 2013/10/29 11:22:09 UTC
Fault handling scenarios for Stratos cartridge instances
Hi all,
We (Imesh, Reka, and I) had a small discussion on $subject while
working on Stratos 4.0 M1.
It is about handling faults in VM instances. For example, there can be three
basic faults:
- Network issue
- Application process is terminated
- VM itself is terminated
Here is the decision table:
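Process | VM                                | Decision flow
--------+-----------------------------------+--------------------------------------------------
Down    | Up                                | - Cartridge agent publishes an event to CC
        |                                   | - CC updates the instance status in the topology
        |                                   | - Autoscaler decides to kill it
--------+-----------------------------------+--------------------------------------------------
Down    | Down (the agent may have crashed) | - CEP identifies that and publishes an event to
        |                                   |   the Autoscaler
        |                                   | - Autoscaler calls CC to terminate the instance
        |                                   |   (if available) and remove it from the topology
        |                                   | - Autoscaler spawns another instance to cover it
--------+-----------------------------------+--------------------------------------------------
Up      | Up (but network issue)            | - CEP sends statistics on faulty requests to the
        |                                   |   Autoscaler
        |                                   | - Autoscaler keeps monitoring the instance and
        |                                   |   takes a decision to terminate it
        |                                   | - Autoscaler spawns another instance to cover it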
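To make the table easier to reason about, here is a minimal sketch of it as a lookup in Python. This is only an illustration and not Stratos code: the names State, DECISION_FLOW and decision_flow, and the network_ok flag, are assumptions made up for this example. The step strings simply restate the rows of the table.

```python
from enum import Enum
from typing import List


class State(Enum):
    UP = "up"
    DOWN = "down"


# Hypothetical encoding of the decision table: (process, vm) -> ordered steps.
# "VM down" also covers the case where only the agent has crashed.
DECISION_FLOW = {
    (State.DOWN, State.UP): [
        "Cartridge agent publishes event to CC",
        "CC updates instance status in topology",
        "Autoscaler decides to kill the instance",
    ],
    (State.DOWN, State.DOWN): [
        "CEP identifies the fault and publishes event to Autoscaler",
        "Autoscaler calls CC to terminate (if available) and remove the instance from topology",
        "Autoscaler spawns a replacement instance",
    ],
    (State.UP, State.UP): [
        "CEP sends statistics on faulty requests to Autoscaler",
        "Autoscaler keeps monitoring and takes a decision to terminate the instance",
        "Autoscaler spawns a replacement instance",
    ],
}


def decision_flow(process: State, vm: State, network_ok: bool = True) -> List[str]:
    """Return the ordered steps for an observed fault, or [] if the instance is healthy."""
    if process is State.UP and vm is State.UP and network_ok:
        return []  # healthy instance: nothing to do
    try:
        return DECISION_FLOW[(process, vm)]
    except KeyError:
        # (process up, VM down) is not covered by the table in this thread.
        raise ValueError("no decision flow defined for this combination") from None
```

For example, decision_flow(State.DOWN, State.UP) returns the three steps of the first row, and decision_flow(State.UP, State.UP, network_ok=False) returns the steps of the last row.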
Please share your thoughts here.
Thanks.
--
--
Lahiru Sandaruwan
Software Engineer,
Platform Technologies,
WSO2 Inc., http://wso2.com
lean.enterprise.middleware
email: lahirus@wso2.com cell: (+94) 773 325 954
blog: http://lahiruwrites.blogspot.com/
twitter: http://twitter.com/lahirus
linked-in: http://lk.linkedin.com/pub/lahiru-sandaruwan/16/153/146
Re: Fault handling scenarios for Stratos cartridge instances
Posted by Udara Liyanage <ud...@wso2.com>.
Doesn't the Autoscaler spawn a new instance in the first case of the above
fault table?
On Thu, Nov 21, 2013 at 1:16 AM, Lahiru Sandaruwan <la...@wso2.com> wrote:
> Hi Manula and Udara,
>
> Please update the JIRA issues with your progress on this, at [1] and its sub-issues.
>
> Thanks.
> [1] https://issues.apache.org/jira/browse/STRATOS-144
--
Udara Liyanage
Software Engineer
WSO2, Inc.: http://wso2.com
lean. enterprise. middleware
web: http://udaraliyanage.wordpress.com
phone: +94 71 443 6897
Re: Fault handling scenarios for Stratos cartridge instances
Posted by Lahiru Sandaruwan <la...@wso2.com>.
Hi Manula and Udara,
Please update the JIRA issues with your progress on this, at [1] and its sub-issues.
Thanks.
[1] https://issues.apache.org/jira/browse/STRATOS-144
On Tue, Oct 29, 2013 at 4:35 PM, Nirmal Fernando <ni...@gmail.com> wrote:
> +1, looks OK. We need OS-level monitoring of the agent to keep it
> alive (to avoid the scenario where the process is up, the agent is down,
> and the VM is up). That's an easy task.
Re: Fault handling scenarios for Stratos cartridge instances
Posted by Nirmal Fernando <ni...@gmail.com>.
+1, looks OK. We need OS-level monitoring of the agent to keep it
alive (to avoid the scenario where the process is up, the agent is down,
and the VM is up). That's an easy task.
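As a rough illustration of what keeping the agent alive at the OS level could look like, here is a minimal watchdog sketch in Python. The thread does not say which mechanism would be used (an init system or a process supervisor would be the more usual choice), so the supervise function, its parameters, and the command passed to it are all assumptions for this example.

```python
import subprocess
import time
from typing import List


def supervise(cmd: List[str], max_restarts: int = 3, poll_interval: float = 1.0) -> int:
    """Run cmd and restart it whenever it exits, up to max_restarts times.

    Returns the number of times the command was started.
    """
    starts = 0
    while starts <= max_restarts:
        proc = subprocess.Popen(cmd)  # (re)start the agent process
        starts += 1
        # Poll until the agent process exits (crash or normal termination).
        while proc.poll() is None:
            time.sleep(poll_interval)
    return starts
```

A call such as supervise(["/path/to/agent-start-script"]) would start the (hypothetical) agent command and restart it up to three times if it exits. A real setup would also need logging and a back-off between restarts.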
On Tue, Oct 29, 2013 at 3:52 PM, Lahiru Sandaruwan <la...@wso2.com> wrote:
> Hi all,
>
> We(Imesh, Reka, and myself) had a small discussion on $subject while
> working on Stratos 4.0 M1.
--
Best Regards,
Nirmal
Nirmal Fernando.
PPMC Member & Committer of Apache Stratos,
Senior Software Engineer, WSO2 Inc.
Blog: http://nirmalfdo.blogspot.com/