Posted to users@cloudstack.apache.org by Jonas Schlichtenbrede <jo...@gmail.com> on 2016/07/21 11:55:54 UTC

Working Site-to-Site VPN gets disconnected and VPC seems to forgets ACL’s

Hi CloudStack Users and Developers,

we’re currently implementing a new CloudStack environment based on 4.8.0.1
(System VM Template is 4.6) with XenServer 6.5 SP1 and all the latest
updates.

So far everything works as expected; we only have an issue with the
stability of Site-to-Site VPNs within VPCs and, we think, with ACLs.

I’ll try to describe the problem and behaviour:

A connected and working S2S VPN switches to disconnected after some time
(usually a few hours). Along with that, the VPC seems to “forget” its
ACLs. Restarting only the Network Tier (the one a VM lives in) solves the
issues for a short period of time (1-3 hours): the state of the VPN
switches to connected and the S2S VPN works again. Pinging from the VM to
any public address also works again. Strangely, browsing to a website, for
example, works the whole time. Isolated networks, however, work like a
charm.
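
One way to test the impression that the VPC "forgets" its ACLs would be to
dump the router's firewall rules while everything works and again after the
failure, and then compare the two dumps. The following is only a sketch under
assumptions: it assumes the ACLs end up as iptables rules on the virtual
router and that both dumps were taken there with iptables-save. The function
name missing_rules is made up for this illustration.

```python
from typing import List


def missing_rules(before: str, after: str) -> List[str]:
    """Return rule lines found in the first iptables-save dump but not in the second."""

    def rules(dump: str) -> List[str]:
        # Keep only rule lines ("-A ..."). Comments, table headers, chain
        # policy lines and COMMIT markers are skipped. A leading
        # "[packets:bytes]" counter (as printed by "iptables-save -c") is
        # stripped so that changing counters do not look like changed rules.
        found = []
        for line in dump.splitlines():
            line = line.strip()
            if line.startswith("["):
                line = line.split("]", 1)[1].strip()
            if line.startswith("-A "):
                found.append(line)
        return found

    remaining = set(rules(after))
    return [rule for rule in rules(before) if rule not in remaining]
```

If the returned list is empty after the VPN has dropped, the rules are still
in place and the cause is more likely elsewhere (for example in the IPsec
state rather than in the ACLs).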

We tried to solve this issue through several tests. We changed the network
setup and reduced the complexity just to isolate this behaviour, but
it’s always the same. We also tried several different connections to
different customer gateways (firewalls) and a VPC-VPN to VPC-VPN connection
to another CloudStack deployment (based on version 4.5.2) without any
success.

In addition, we tested several setups, such as CentOS 6 and CentOS 7, but
the result is always the same. We updated one installation to yesterday's
master (4.9.0.0-snapshot), again without success. We do not have any issues
with version 4.5.2, but that installation is in a different datacentre.

Below you’ll find some logs – the relevant IP for this test connection is:
*85.88.16.104*

CloudStack 4.8.0.1 Logs (Google Docs):

https://drive.google.com/open?id=1gqIjDdG1htps4p1t7m1uHSs7aNHplWp1Np83nH6e7zM


IPsec Logs from the Virtual Router:
https://drive.google.com/open?id=1ZWvhFu2P_Wv_lF8TgYMmexeS_KDag1Mp-kmuhl8l7uU


Thank you in advance for your help!

Jonas

PS: If it is possible from your side, we can do a remote session to take a
look at the setup.

Re: Working Site-to-Site VPN gets disconnected and VPC seems to forgets ACL’s

Posted by Jonas Schlichtenbrede <jo...@gmail.com>.
Hi Simon,

thanks for your reply. Currently we don't use private gateways - this
deployment never made it that far ;-)

It's always a very simple setup because we are testing several scenarios
(especially regarding the network setup) to find a solution for this issue.

@ Jayapal

You'll find the output of ipsec status via the following Google Drive
folder including some new logs:

   - https://drive.google.com/folderview?id=0BzK2SWUHninKaGpSS0twQlR0NUE&usp=sharing

For a new try, the connection was established on 25.07.2016 around 16:38:00
and broke on 25.07.2016 around 19:54:00. Logs from the VR are 2 hours
behind.
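
To line these times up with the VR logs, the two-hour offset has to be
subtracted. The small sketch below does that arithmetic and also computes how
long the tunnel stayed up. It takes the two approximate times from this
message at face value; the variable names are chosen for illustration only.

```python
from datetime import datetime, timedelta

FMT = "%d.%m.%Y %H:%M:%S"            # date format used in this message
VR_LOG_OFFSET = timedelta(hours=-2)  # VR logs are 2 hours behind

established = datetime.strptime("25.07.2016 16:38:00", FMT)
broke = datetime.strptime("25.07.2016 19:54:00", FMT)

# How long the connection stayed up before it broke.
uptime = broke - established
print("tunnel uptime:", uptime)  # 3:16:00

# Corresponding timestamps to search for in the VR logs.
print("VR log time at connect:", (established + VR_LOG_OFFSET).strftime(FMT))
print("VR log time at failure:", (broke + VR_LOG_OFFSET).strftime(FMT))
```

So the relevant VR log entries should be around 14:38:00 and 17:54:00, and the
tunnel was up for roughly 3 hours and 16 minutes.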

Well, so far we do not have any other ideas how to isolate this issue.
Maybe it's related to the hardware of the XenServer hosts: Intel i350
Gigabit NICs are used. However, we use these cards in a working setup
too.

Is it possible to take a look at the setup together?


Re: Working Site-to-Site VPN gets disconnected and VPC seems to forgets ACL’s

Posted by Simon Weller <sw...@ena.com>.
Do you use private gateways as well in your VPC environment? If so, does the same ACL problem occur there as well?



Re: Working Site-to-Site VPN gets disconnected and VPC seems to forgets ACL’s

Posted by Jonas Schlichtenbrede <jo...@gmail.com>.
Hi Jayapal,

thanks for your feedback!

We already tested the VPN with and without dead peer detection (DPD) - always
the same behaviour. I'll try 'ipsec auto --status' to see the output.

Browsing is just browsing to a website from within the VM (Windows Server
VM + IE). This works even if the VPN switches to disconnected...

At the moment we use NIC Bonding with the XenServers, but of course we
disabled one switch just to be sure that there is no general network or
switch issue. At the beginning we tested this setup without NIC Bonding,
too (again the same issues).

The strange thing is that everything is working for a few hours and then
just stops and a simple restart of the VPC Tier fixes it.

Is there a way to debug/analyse why the VR cuts/drops the connections or at
which stage (Xen, Switch, Top of the Rack Switch,...)?

Thanks
Jonas

PS: On guest networks we noticed that, for example, an active RDP session
(port forwarding in general) stops working at the same time. Again,
browsing to a website from a VM inside such a guest network still
works...



Re: Working Site-to-Site VPN gets disconnected and VPC seems to forgets ACL’s

Posted by Jayapal Uradi <ja...@accelerite.com>.
Hi Jonas,

It seems the connection is going down because of dead peer detection (DPD).

On the router, run the command 'ipsec auto --status' to check the VPN connection status.
When the connection is down, initiate traffic from the guest VM to the other end of the VPN, then check the IPsec VPN status on the router again (ipsec auto --status).
This shows whether the connection is up or not in the VR. It takes up to one router status polling interval for the VPN status to be updated.
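
The check described above could be repeated automatically so that the exact
moment of the drop gets a timestamp. The following is a minimal sketch, not a
tested tool. It assumes Python 3.7+ is available where the command runs, and
it assumes that an established tunnel shows up in the status output with the
phrase "IPsec SA established"; that marker should be checked against the real
output first. The helper names tunnel_is_up, read_status and watch are made up
for this example.

```python
import subprocess
import time
from datetime import datetime


def tunnel_is_up(status_text, marker="IPsec SA established"):
    """True if any line of the status output contains the (assumed) marker."""
    return any(marker in line for line in status_text.splitlines())


def read_status():
    # Runs the command quoted in this message; has to be executed on the router.
    result = subprocess.run(["ipsec", "auto", "--status"],
                            capture_output=True, text=True)
    return result.stdout


def watch(interval_s=60, read=read_status):
    """Print a timestamped line every time the tunnel state changes."""
    last = None
    while True:
        up = tunnel_is_up(read())
        if up != last:
            stamp = datetime.now().isoformat(timespec="seconds")
            print(stamp, "tunnel up" if up else "tunnel down")
            last = up
        time.sleep(interval_s)
```

Calling watch() and leaving it running until the VPN drops would give a
timestamp that can be compared with the IPsec and management server logs.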

Is the browsing you mentioned browsing to servers at the other end of the VPN?

Thanks,
Jayapal





DISCLAIMER
==========
This e-mail may contain privileged and confidential information which is the property of Accelerite, a Persistent Systems business. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Accelerite, a Persistent Systems business does not accept any liability for virus infected mails.