Posted to dev@stratos.apache.org by "Martin Eppel (meppel)" <me...@cisco.com> on 2014/10/17 20:08:52 UTC

[DISCUSS] Achieving HA or 100% (99.999%) uptime for apache stratos

I would like to discuss what it would take to achieve 100% uptime for Stratos in a production environment (aiming high to reach the five nines). If this has been discussed before, please point me to the email thread.

The goal is to identify recommended deployment scenarios and possible shortcomings (or readiness) to reach five nines.
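
For reference, five nines (99.999%) leaves a downtime budget of roughly 5.26 minutes per year. The sketch below is plain arithmetic, not anything Stratos-specific; the function name and the 365.25-day year are assumptions chosen for illustration.

```python
# Downtime budget implied by an availability target (illustrative arithmetic only).
# Assumption: a year is taken as 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def downtime_budget_seconds(availability: float,
                            period_seconds: float = SECONDS_PER_YEAR) -> float:
    """Return the maximum downtime allowed over the period for a given availability."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * period_seconds


for label, target in [("three nines", 0.999),
                      ("four nines", 0.9999),
                      ("five nines", 0.99999)]:
    minutes = downtime_budget_seconds(target) / 60
    print(f"{label}: {minutes:.2f} minutes per year")
```

Any maintenance window, upgrade, or failover delay has to fit inside that budget, which is why the scenarios below matter.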

This includes the following scenarios:
+ maintenance cycles,
+ upgrades,
+ hardware and software failures
+ scalability
+ ... ?

Generally, it seems the suggested system model to reach 100% uptime (or the highest possible uptime) is an n-way redundancy model with multiple active/standby assignments.
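
As a rough illustration of why n-way redundancy helps, the sketch below estimates the combined availability of n redundant nodes. It assumes that node failures are independent and that failover is instantaneous and always succeeds; neither holds exactly in practice, so the result is an upper bound. The function names and the 99% per-node figure are hypothetical.

```python
# Idealized availability of n redundant nodes (a sketch under stated assumptions).
# Assumptions: independent node failures, instantaneous and perfect failover.


def combined_availability(node_availability: float, nodes: int) -> float:
    """The service is up as long as at least one of the nodes is up."""
    if not 0.0 < node_availability <= 1.0:
        raise ValueError("node_availability must be in (0, 1]")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return 1.0 - (1.0 - node_availability) ** nodes


def nodes_needed(node_availability: float, target: float) -> int:
    """Smallest number of redundant nodes whose combined availability meets the target."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    nodes = 1
    while combined_availability(node_availability, nodes) < target:
        nodes += 1
    return nodes


# Hypothetical example: nodes that are each 99% available, five nines target.
print(nodes_needed(0.99, 0.99999))
```

Shared dependencies (the registry RDBMS, the message broker, the network) break the independence assumption, so each of them needs its own redundancy for the estimate to mean anything.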

I looked at the HA documentation for 4.1 (see https://cwiki.apache.org/confluence/display/STRATOS/4.1.0+Providing+High+Availability+for+Stratos):

Stratos allows for two deployment models: single JVM and distributed.

Which one is better suited to reach the stated goal of 100% uptime with an n-way redundancy model?

According to the link (and please correct me if I am wrong), it seems that the components that currently allow n-way redundancy are:

+ BAM (doc is not updated yet; see https://docs.wso2.com/display/CLUSTER420/Clustering+BAM+2.5.0 ?)

+ core components (Manager, Autoscaler, Cloud Controller) in active/passive mode through Linux HA
   The RDBMS used for the registry needs to support n-way redundancy as well

+ ActiveMQ
   Multiple models are suggested: ZooKeeper, shared DB, or shared file system. Which one would be recommended to achieve n-way redundancy?

CEP seems to allow a 2-node configuration only. Or is there support for n-way redundancy?

The Stratos Load Balancer documentation lists some caveats, such as session affinity not being supported in a distributed environment. Is it n-way ready?

Are there any other components I have missed?

What are the missing pieces to reach n-way redundancy (or 100% uptime)?

Are there any other models to reach the stated goal, and what would it take to get Stratos there?

Thanks

Martin


Re: [DISCUSS] Achieving HA or 100% (99.999%) uptime for apache stratos

Posted by Imesh Gunaratne <im...@apache.org>.
On Sun, Oct 19, 2014 at 9:26 PM, Lakmal Warusawithana <la...@wso2.com>
wrote:

>
> Imesh is working on how you can achieve active/passive with Stratos
> 4.0.0. But we are working on active/active for all of Stratos core with
> clustering support, which is going to address both high availability and
> scalability. IMO we should release it with the next immediate release.
>

+1 Will plan this for the next release, Lakmal.

Thanks


-- 
Imesh Gunaratne

Technical Lead, WSO2
Committer & PMC Member, Apache Stratos

Re: [DISCUSS] Achieving HA or 100% (99.999%) uptime for apache stratos

Posted by Lakmal Warusawithana <la...@wso2.com>.
On Sat, Oct 18, 2014 at 12:19 PM, Imesh Gunaratne <im...@apache.org> wrote:

> Hi Martin,
>
> Please find my comments inline:
>
> On Fri, Oct 17, 2014 at 11:38 PM, Martin Eppel (meppel) <me...@cisco.com>
> wrote:
>
>>  I would like to discuss what it would take to achieve 100% uptime for
>> stratos in a production environment (aiming high to reach the five nines)
>> -  if it had been discussed before please point me to the email thread.
>>
>
> Unlike with other software systems, a brief downtime of a PaaS might not
> affect the deployed services, because it will not bring the services
> (running instances) down. However, yes, we need to provide 100% uptime.
>
>>
>>
>> The goal is to identify recommended deployment scenarios and possible
>> shortcomings (or readiness ) to reach  five nines.
>>
>>
>>
>> This includes the following scenarios:
>>
>> + maintenance cycles,
>>
>> + upgrades,
>>
>> + hardware and software failures
>>
>> + scalability
>>
>> + ... ?
>>
> +1 We need to address all of these
>
>>
>>
>> Generally, it seems the suggested system model to reach 100% uptime (or
>> the highest possible uptime) is a n way redundancy model with multiple
>> active / standby assignments.
>>
>>
>>
>> I looked in the HA for 4.1,  see web link
>> https://cwiki.apache.org/confluence/display/STRATOS/4.1.0+Providing+High+Availability+for+Stratos
>> :
>>
>>
>>
>> Stratos allows for 2 deployment models, single JVM and distributed
>> deployment model.
>>
>>
>>
>> Which one will be better suited to reach the stated goal of 100% uptime /
>> n way redundancy model ?
>>
>
> The deployment model is about the level of capacity Stratos can provide
> (the number of instances that it can support), not the level of HA. In both
> deployment models we should be able to provide the same level of HA.
>
>>
>>
>> According to the link (and please correct me if I am wrong), it seems that
>> currently the components to allow n-way redundancy are:
>>
>>
>>
>> + BAM (doc is not updated yet, see
>> https://docs.wso2.com/display/CLUSTER420/Clustering+BAM+2.5.0  ?
>>
>
> Yes, I believe we are still using BAM 2.4.1; please see the link below:
>
> https://docs.wso2.com/display/CLUSTER420/Fully-Distributed%2C+High-Availability+BAM+Setup
>
>
>>
>>
>> + core component (Manager, Autoscaler, Cloud controller) in
>> active/passive mode through Linux HA
>>    RDBMS used for registry needs to support n-way redundancy as well
>>
>
> Currently I'm doing a POC on this using Pacemaker/Heartbeat; I will provide
> details soon:
> https://issues.apache.org/jira/browse/STRATOS-897
>

Imesh is working on how you can achieve active/passive with Stratos 4.0.0.
But we are working on active/active for all of Stratos core with clustering
support, which is going to address both high availability and scalability. IMO
we should release it with the next immediate release.



>
>
>
>>
>> + ActiveMq
>>    multiple models suggested, Zookeeper,  shared DB or shared file
>> systems. Which one would be recommended to achieve n-way redundancy ?
>>
>
> Yes, we need to investigate this further.
>
>>
>>
>> CEP seems to allow a 2 node configuration only or is there support for
>> n-way redundancy ?
>>
>
> In distributed cache mode deployment it supports many CEP instances; I will
> check on this further:
>
> https://docs.wso2.com/display/CLUSTER420/Clustering+Complex+Event+Processor#ClusteringComplexEventProcessor-Distributedcachemodedeployment
>
>
>>
>>
>> Stratos Load Balancer, lists some caveat like session affinity not
>> supported in distributed environment, n-way ready ?
>>
>>
>>
> Yes, the load balancer still does not have features to replicate session
> information in a distributed deployment.
>
>
> Thanks
>
> --
> Imesh Gunaratne
>
> Technical Lead, WSO2
> Committer & PMC Member, Apache Stratos
>



-- 
Lakmal Warusawithana
Vice President, Apache Stratos
Director - Cloud Architecture; WSO2 Inc.
Mobile : +94714289692
Blog : http://lakmalsview.blogspot.com/

Re: [DISCUSS] Achieving HA or 100% (99.999%) uptime for apache stratos

Posted by Imesh Gunaratne <im...@apache.org>.
Hi Martin,

Please find my comments inline:

On Fri, Oct 17, 2014 at 11:38 PM, Martin Eppel (meppel) <me...@cisco.com>
wrote:

>  I would like to discuss what it would take to achieve 100% uptime for
> stratos in a production environment (aiming high to reach the five nines)
> -  if it had been discussed before please point me to the email thread.
>

Unlike with other software systems, a brief downtime of a PaaS might not
affect the deployed services, because it will not bring the services
(running instances) down. However, yes, we need to provide 100% uptime.

>
>
> The goal is to identify recommended deployment scenarios and possible
> shortcomings (or readiness ) to reach  five nines.
>
>
>
> This includes the following scenarios:
>
> + maintenance cycles,
>
> + upgrades,
>
> + hardware and software failures
>
> + scalability
>
> + ... ?
>
+1 We need to address all of these

>
>
> Generally, it seems the suggested system model to reach 100% uptime (or
> the highest possible uptime) is a n way redundancy model with multiple
> active / standby assignments.
>
>
>
> I looked in the HA for 4.1,  see web link
> https://cwiki.apache.org/confluence/display/STRATOS/4.1.0+Providing+High+Availability+for+Stratos
> :
>
>
>
> Stratos allows for 2 deployment models, single JVM and distributed
> deployment model.
>
>
>
> Which one will be better suited to reach the stated goal of 100% uptime /
> n way redundancy model ?
>

The deployment model is about the level of capacity Stratos can provide (the
number of instances that it can support), not the level of HA. In both
deployment models we should be able to provide the same level of HA.

>
>
> According to the link (and please correct me if I am wrong), it seems that
> currently the components to allow n-way redundancy are:
>
>
>
> + BAM (doc is not updated yet, see
> https://docs.wso2.com/display/CLUSTER420/Clustering+BAM+2.5.0  ?
>

Yes, I believe we are still using BAM 2.4.1; please see the link below:
https://docs.wso2.com/display/CLUSTER420/Fully-Distributed%2C+High-Availability+BAM+Setup


>
>
> + core component (Manager, Autoscaler, Cloud controller) in active/passive
> mode through Linux HA
>    RDBMS used for registry needs to support n-way redundancy as well
>

Currently I'm doing a POC on this using Pacemaker/Heartbeat; I will provide
details soon:
https://issues.apache.org/jira/browse/STRATOS-897


>
> + ActiveMq
>    multiple models suggested, Zookeeper,  shared DB or shared file
> systems. Which one would be recommended to achieve n-way redundancy ?
>

Yes, we need to investigate this further.

>
>
> CEP seems to allow a 2 node configuration only or is there support for
> n-way redundancy ?
>

In distributed cache mode deployment it supports many CEP instances; I will
check on this further:
https://docs.wso2.com/display/CLUSTER420/Clustering+Complex+Event+Processor#ClusteringComplexEventProcessor-Distributedcachemodedeployment


>
>
> Stratos Load Balancer, lists some caveat like session affinity not
> supported in distributed environment, n-way ready ?
>
>
>
Yes, the load balancer still does not have features to replicate session
information in a distributed deployment.


Thanks

-- 
Imesh Gunaratne

Technical Lead, WSO2
Committer & PMC Member, Apache Stratos