Posted to dev@cloudstack.apache.org by Hugo Trippaers <hu...@apache.org> on 2014/07/14 09:38:22 UTC

VPC Virtual Router and Redundancy

Hey All,

As discussed in the thread http://markmail.org/message/56xrscvnmdweoxf5, some of us are working on making the VPC virtual router redundant. As part of this effort we are doing a little more than strictly necessary, to get this done properly. Based on the feedback we at Schuberg Philis are getting from our colleagues, we have identified a list of design goals for the VPC virtual router. The most important are:

 * reboot proof, making sure that the router will come up with the proper configuration after a reboot without management server intervention
 * redundant, the VPC router should be able to fail over to another device with the minimum possible interruption to the service
 * introduce the new features with a smooth upgrade path for existing deployments

We are working on this with a team of developers, some of whom are committers and some of whom are not, so we will be working in a GitHub repository most of the time. If you wish to keep an eye on what we are doing, check the branches starting with vpc-toolkit at https://github.com/schubergphilis/cloudstack/commits/vpc-toolkit

When possible we will commit completed parts to a feature branch or the master branch to make sure we don’t diverge too much from the actual state of things.

We are currently doing a number of experiments on how to achieve our design goals and at the same time we are working on making the code a little more legible by refactoring some of the components like the VirtualNetworkApplianceManager and the VirtualRoutingResource. 

Our current thinking is to persist configuration state on the virtual router device (the actual VM) and configure the VR based on that configuration. We intend to put most of the configuration in JSON files on the template and use either configuration management tools or custom scripts to do the configuration. A typical command implementation would take two steps: first push an update to a configuration file, then trigger an update script.
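
To illustrate the two-step flow described above, here is a minimal sketch in Python. It is not taken from the actual implementation: the function names, the file layout and the example settings are all assumptions made for illustration, and the "update script" is replaced by a stand-in command that only prints the file it is given.

```python
import json
import os
import subprocess
import sys
import tempfile


def push_config(config_dir, name, settings):
    """Step 1: persist the settings as a JSON file (hypothetical layout)."""
    os.makedirs(config_dir, exist_ok=True)
    target = os.path.join(config_dir, name + ".json")
    # Write to a temporary file first, then swap it in, so a reader
    # never sees a half-written configuration file.
    fd, tmp_path = tempfile.mkstemp(dir=config_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(settings, handle, indent=2, sort_keys=True)
    os.replace(tmp_path, target)
    return target


def trigger_update(update_command, config_path):
    """Step 2: run the update command with the config file as its argument."""
    result = subprocess.run(
        update_command + [config_path], capture_output=True, text=True
    )
    return result.returncode, result.stdout


if __name__ == "__main__":
    config_dir = tempfile.mkdtemp()
    path = push_config(
        config_dir, "guest_network", {"device": "eth2", "gateway": "10.1.1.1"}
    )
    # Stand-in for a real update script: it only echoes the file contents.
    stand_in = [sys.executable, "-c", "import sys; print(open(sys.argv[1]).read())"]
    code, output = trigger_update(stand_in, path)
    print(code)
    print(output)
```

Because the file stays on the router after step 1, the same update step could in principle be re-run after a reboot without the management server having to push anything again.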

Of course, testing will be an important part of everything we are doing. We are already working on improving the existing unit tests for the VR and VPC code, and we are setting up a procedure to test the systemvm configuration scripts as well.

We’ll post more updates like this to the list as we make progress.

Cheers,

Hugo


Re: VPC Virtual Router and Redundancy

Posted by sebgoa <ru...@gmail.com>.
On Jul 14, 2014, at 9:38 AM, Hugo Trippaers <hu...@apache.org> wrote:

> Hey All,
> 
> As discussed in thread http://markmail.org/message/56xrscvnmdweoxf5 some of us are working on making the VPC virtual router redundant. As part of this effort we are doing just a little bit more to get this properly done. Based on the feedback we at Schuberg Philis are getting from our colleagues we have identified a list of design goals that we would like to improve in the vpc virtual router, the most important of those are :
> 
> * reboot proof, making sure that the router will come up with the proper configuration after a reboot without management server intervention
> * redundant, the VPC router should be able to fail-over to another device with minimum possible interruption to the service
> * introduce the new features with a smooth upgrade path for existing deployments

+1 to a 'design documents' wiki page that describes this feature in a bit more detail

…like all other features...

> 
> We are working on this with a team of developers, some of which are committers and some of which are not. So we will be working in a github repository most of the time. If you wish to keep an eye on what we are doing check the branches starting with vpc-toolkit at https://github.com/schubergphilis/cloudstack/commits/vpc-toolkit
> 
> When possible we will commit completed parts to a feature branch or the master branch to make sure we don’t diverge too much from the actual state of things.
> 
> We are currently doing a number of experiments on how to achieve our design goals and at the same time we are working on making the code a little more legible by refactoring some of the components like the VirtualNetworkApplianceManager and the VirtualRoutingResource. 
> 
> Our current thinking is to persist configuration state on the virtual router device (the actual vm) and configure the VR based on that configuration. We intend to put most of the configuration in json files on the template and use either configuration management tools or custom scripts to do the configuration. A typical command implementation would take two steps. First push an update to a configuration file and then trigger an update script. 
> 
> Of course an important part of everything we are doing will include testing, we are already working on improving the existing unit tests for the VR and VPC code and we are setting up a procedure to test the systemvm configuration scripts as well.
> 
> We’ll do more updates like this to the list as we make progress. 
> 
> Cheers,
> 
> Hugo
> 


Re: VPC Virtual Router and Redundancy

Posted by Leo Simons <LS...@schubergphilis.com>.
Hey all,

High time for an overview/update on this...

Refactoring of the systemvm build process
=====
@see https://github.com/apache/cloudstack/pull/16

This should be about ready to merge now. Rohit helped port it to buildacloud.org and is now testing it on KVM, and will apply the pull request if those tests pass. It was a prerequisite for...

Component testing for the systemvm
=====
@see https://github.com/schubergphilis/cloudstack/tree/feature/systemvm-test

I have not submitted a pull request for this yet, because it currently still depends on downloading images from our internal infrastructure. Once the build refactor is merged in and the main builds are updated to use it, we can register those builds with vagrantcloud.com, and then this can be merged too.

Integration tests using the modified systemvm
=====
@see http://markmail.org/message/zjhjwky3god6d25i

I showed this work to Ian, Rohit and Wido last week. We’re having some (infrastructure) issues getting these tests stable, so it’s not quite ready for open sourcing. However, since it’s taking a bit longer than I wanted, I’ll look into getting the code opened up anyway.

Reboot proof systemvm
=====
@see https://github.com/schubergphilis/cloudstack/tree/feature/systemvm-persistent-config
@see http://markmail.org/message/n4sohmhowzm244iq

We’re at the point now where we are probably pretty happy with how this looks. update_config.py...

  https://github.com/schubergphilis/cloudstack/blob/feature/systemvm-persistent-config/systemvm/patches/debian/config/opt/cloud/bin/update_config.py

is becoming the sole entry point to making network-related systemvm configuration changes. It’s a wrapper around configure.py...

  https://github.com/schubergphilis/cloudstack/blob/feature/systemvm-persistent-config/systemvm/patches/debian/config/opt/cloud/bin/configure.py#L974

Many of the networking changes are now done by uploading a JSON configuration describing the desired state, and then executing this script to converge the systemvm to that state. This also makes the lower layers on the Java side that change the systemvm look rather neater, for example

  https://github.com/schubergphilis/cloudstack/blob/e82bfb410b758dce372c4203febecb373880594b/core/src/com/cloud/agent/resource/virtualnetwork/ConfigHelper.java#L102

There are component tests for this new way of applying config changes

  https://github.com/schubergphilis/cloudstack/blob/feature/systemvm-persistent-config/test/systemvm/test_update_config.py

and those make us pretty hopeful all the changes are correct. However, we need to spend more time on integration testing, to verify that the end result still behaves exactly as it should and that everything is fully backward compatible.
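
As a rough illustration of the "describe the desired state, then converge" idea from this section, here is a small Python sketch. It does not reflect how update_config.py or configure.py actually work: the JSON shape, the function names and the in-memory "state" are hypothetical, chosen only to show that applying the same desired state twice results in no further changes.

```python
import json


def plan_changes(current, desired):
    """Return the (action, address) steps that turn `current` into `desired`."""
    current_set, desired_set = set(current), set(desired)
    steps = [("add", addr) for addr in sorted(desired_set - current_set)]
    steps += [("remove", addr) for addr in sorted(current_set - desired_set)]
    return steps


def converge(current, desired):
    """Apply the planned steps to an in-memory copy of the current state."""
    state = set(current)
    for action, addr in plan_changes(current, desired):
        if action == "add":
            state.add(addr)
        else:
            state.discard(addr)
    return sorted(state)


if __name__ == "__main__":
    # Hypothetical uploaded document; the real files may look different.
    uploaded = json.loads('{"ips": ["10.0.0.2/24", "10.0.0.3/24"]}')
    current = ["10.0.0.1/24", "10.0.0.2/24"]
    print(plan_changes(current, uploaded["ips"]))
    after = converge(current, uploaded["ips"])
    # A second run against the converged state plans no steps at all.
    print(plan_changes(after, uploaded["ips"]))
```

The design point is that the caller only states what the result should be; working out which individual changes are needed is left to the script on the systemvm.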

VPC Java code refactoring
=====
@see https://github.com/apache/cloudstack/pull/19
(old https://github.com/apache/cloudstack/pull/18
     https://github.com/apache/cloudstack/pull/14)

I suggest reading the pull requests for details. This is meant to be a fully backwards-compatible code cleanup that’s independent of the other changes mentioned above. We wanted to get unit test coverage up on this code before making functional changes to it, working to de-duplicate VPC and non-VPC networking code as much as possible. We ended up introducing a pretty neat new router deployment definition setup.

Redundant routing
=====
With these two tasks pretty much out of the way, while we still have a bunch of integration testing to finish, we’re now also starting in earnest on making the result actually redundant.

We’re getting the ‘building blocks’ for redundancy (vrrp, heartbeat, ...) into the configure.py framework

  https://github.com/schubergphilis/cloudstack/commit/c95f5e208640f49a9f1855a7976ff0518184bdf1

and after that we’ll probably spend a bit of time experimenting with how to best put them together. Once that design solidifies (though I imagine it will not deviate much from the previous discussions we had on #cloudstack-meeting) it’ll be time for another update :-)


cheers,


Leo
