Posted to dev@stratos.apache.org by Lahiru Sandaruwan <la...@wso2.com> on 2013/10/29 11:22:09 UTC

Fault handling scenarios for Stratos cartridge instances

Hi all,

We (Imesh, Reka, and I) had a short discussion on $subject while
working on Stratos 4.0 M1.

This is about handling faults in VM instances. For example, there can be
three basic faults:

   - Network Issue
   - Application process is terminated
   - VM itself is terminated

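Here is the decision table:

Process | VM                      | Decision flow
--------+-------------------------+--------------------------------------------
Down    | Up                      | - Cartridge agent publishes an event to CC
        |                         | - CC updates the instance status in the
        |                         |   topology
        |                         | - Autoscaler decides to kill the instance
--------+-------------------------+--------------------------------------------
Down    | Down (or the agent may  | - CEP identifies this and publishes an
        | have crashed)           |   event to Autoscaler
        |                         | - Autoscaler calls CC to terminate the
        |                         |   instance (if available) and remove it
        |                         |   from the topology
        |                         | - Autoscaler spawns another instance to
        |                         |   cover for it
--------+-------------------------+--------------------------------------------
Up      | Up (but network issue)  | - CEP sends statistics on faulty requests
        |                         |   to Autoscaler
        |                         | - Autoscaler keeps monitoring the instance
        |                         |   and takes a decision to terminate it
        |                         | - Autoscaler spawns another instance to
        |                         |   cover for it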
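
To make the table concrete, here is a minimal sketch in Python of how its
three rows could be encoded. This is not Stratos code: the names Fault,
classify, DECISION_FLOW and decision_flow, and the network_ok flag, are
assumptions made up for illustration. Only the component names (cartridge
agent, CC, CEP, Autoscaler) and the steps themselves come from the table.

```python
from enum import Enum


class Fault(Enum):
    """The three rows of the decision table (hypothetical names)."""
    PROCESS_DOWN = "process down, VM up"
    VM_DOWN = "process down, VM down (or agent crashed)"
    NETWORK_ISSUE = "process up, VM up, network issue"


def classify(process_up, vm_up, network_ok=True):
    """Map an observed state to a table row, or None if no row applies."""
    if not process_up:
        return Fault.PROCESS_DOWN if vm_up else Fault.VM_DOWN
    if vm_up and not network_ok:
        return Fault.NETWORK_ISSUE
    # Healthy instance, or a state the table does not cover.
    return None


# Steps per row, paraphrased from the decision table.
DECISION_FLOW = {
    Fault.PROCESS_DOWN: [
        "Cartridge agent publishes an event to CC",
        "CC updates the instance status in the topology",
        "Autoscaler decides to kill the instance",
    ],
    Fault.VM_DOWN: [
        "CEP identifies the fault and publishes an event to Autoscaler",
        "Autoscaler calls CC to terminate the instance (if available) "
        "and remove it from the topology",
        "Autoscaler spawns another instance to cover for it",
    ],
    Fault.NETWORK_ISSUE: [
        "CEP sends statistics on faulty requests to Autoscaler",
        "Autoscaler keeps monitoring and takes a decision to terminate",
        "Autoscaler spawns another instance to cover for it",
    ],
}


def decision_flow(process_up, vm_up, network_ok=True):
    """Return the list of steps for the observed state (empty if none)."""
    return DECISION_FLOW.get(classify(process_up, vm_up, network_ok), [])
```

For example, decision_flow(False, True) returns the three steps of the
first row, and decision_flow(True, True) returns an empty list.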




Please share your thoughts.

Thanks.



-- 
--
Lahiru Sandaruwan
Software Engineer,
Platform Technologies,
WSO2 Inc., http://wso2.com
lean.enterprise.middleware

email: lahirus@wso2.com cell: (+94) 773 325 954
blog: http://lahiruwrites.blogspot.com/
twitter: http://twitter.com/lahirus
linked-in: http://lk.linkedin.com/pub/lahiru-sandaruwan/16/153/146

Re: Fault handling scenarios for Stratos cartridge instances

Posted by Udara Liyanage <ud...@wso2.com>.
Doesn't the Autoscaler spawn a new instance in the first case of the fault
table above?


On Thu, Nov 21, 2013 at 1:16 AM, Lahiru Sandaruwan <la...@wso2.com> wrote:

> Hi Manula and Udara,
>
> Please update the JIRAs with your progress on this, at [1] and its sub-issues.
>
> Thanks.
> [1] https://issues.apache.org/jira/browse/STRATOS-144


-- 
Udara Liyanage
Software Engineer
WSO2, Inc.: http://wso2.com
lean. enterprise. middleware

web: http://udaraliyanage.wordpress.com
phone: +94 71 443 6897

Re: Fault handling scenarios for Stratos cartridge instances

Posted by Lahiru Sandaruwan <la...@wso2.com>.
Hi Manula and Udara,

Please update the JIRAs with your progress on this, at [1] and its sub-issues.

Thanks.
[1] https://issues.apache.org/jira/browse/STRATOS-144


On Tue, Oct 29, 2013 at 4:35 PM, Nirmal Fernando <ni...@gmail.com> wrote:

> +1, looks OK. We need OS-level monitoring of the agent to keep it
> alive (to avoid the process-up, agent-down, VM-up scenario). That's an easy
> task.




Re: Fault handling scenarios for Stratos cartridge instances

Posted by Nirmal Fernando <ni...@gmail.com>.
+1, looks OK. We need OS-level monitoring of the agent to keep it
alive (to avoid the process-up, agent-down, VM-up scenario). That's an easy
task.
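
As a rough illustration of what such keep-alive monitoring could look like,
here is a minimal watchdog sketch in Python. It is only an assumption about
the approach, not the mechanism Stratos uses: the function name supervise,
its parameters, and the stand-in command are hypothetical, and the real
agent launch command is not given in this thread. In practice an existing
OS-level supervisor (for example systemd or monit) may be a better fit.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=3, poll_interval=0.1):
    """Run cmd and restart it whenever it exits, up to max_restarts times.

    Returns the number of restarts performed. While the process stays
    alive, this function keeps polling and does not return.
    """
    restarts = 0
    proc = subprocess.Popen(cmd)
    while True:
        if proc.poll() is None:
            # Process is still running; check again after a short sleep.
            time.sleep(poll_interval)
            continue
        if restarts >= max_restarts:
            # Give up so that a crash loop does not spin forever.
            return restarts
        restarts += 1
        proc = subprocess.Popen(cmd)


if __name__ == "__main__":
    # Stand-in for the agent: a command that exits immediately, so every
    # poll sees it as "down" and triggers a restart.
    fake_agent = [sys.executable, "-c", "pass"]
    count = supervise(fake_agent, max_restarts=2, poll_interval=0.01)
    print("restarted", count, "time(s)")
```

The restart cap is a design choice for the sketch: without it, an agent that
crashes on startup would be relaunched endlessly instead of being reported
as a fault.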




-- 
Best Regards,
Nirmal

Nirmal Fernando.
PPMC Member & Committer of Apache Stratos,
Senior Software Engineer, WSO2 Inc.

Blog: http://nirmalfdo.blogspot.com/