Posted to users@cloudstack.apache.org by Nux! <nu...@li.nux.ro> on 2015/06/01 13:45:03 UTC

Re: Regular total loss of connectivity

Thanks Simon,

Link up/down has not helped, and turning off TSO etc. on the link has not helped either.
Connectivity is lost as usual after ~4 hours.

I found some suggestions to try and use the e1000 nic instead of virtio, will do that.

Lucian

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

----- Original Message -----
> From: "Simon Weller" <sw...@ena.com>
> To: "dev" <de...@cloudstack.apache.org>, "Cloudstack Users List" <us...@cloudstack.apache.org>
> Sent: Sunday, 31 May, 2015 22:36:56
> Subject: Re: Regular total loss of connectivity

> If you ifdown the interface on the router and then ifup it again, does the arp
> problem resolve itself?
> 
> We've seen a similar issue before caused by malicious/heavy traffic related to
> this bug:
> 
> https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/997978
> 
> - Si
> ________________________________________
> From: Nux! <nu...@li.nux.ro>
> Sent: Sunday, May 31, 2015 4:25 PM
> To: dev; Cloudstack Users List
> Subject: Regular total loss of connectivity
> 
> Hi,
> 
> Following a power cut, one of my cloudstack deployments is having a really weird
> problem that I cannot seem to solve on my own.
> Every 3 hours all the public IPs on the VR stop responding from the Internet.
> From the VR they are of course all reachable.
> In the same VLAN as the public IPs there is another physical server, this one
> can also access the VMs on their IPs just fine.
> 
> The provider has not found the problem and hints at problems with the cloud
> platform, however cloudstack worked just fine until the power cut, not to
> mention the problem persists through HV and ACS upgrades.
> 
> I'm thinking network side arp issues or something like this, alas I am not that
> good with network stuff and don't have access to it anyway.
> 
> If I reboot the VR once or twice the IPs start working again and the VMs are
> accessible from the internet.
> 
> Ideas?
> 
> Env: CentOS 6, KVM, ACS 4.4 to 4.5.1, Adv zone
> 
> Lucian
> 
> --
> Sent from the Delta quadrant using Borg technology!
> 
> Nux!
> www.nux.ro

Re: Regular total loss of connectivity

Posted by Nux! <nu...@li.nux.ro>.

ARP remains unchanged when it happens, and I only have one host.
I'll try the ping thing next.

Thanks
Lucian

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

----- Original Message -----
> From: "Simon Weller" <sw...@ena.com>
> To: users@cloudstack.apache.org, dev@cloudstack.apache.org
> Sent: Monday, 1 June, 2015 18:23:13
> Subject: Re: Regular total loss of connectivity

> What does arp show you on the VR when this occurs?
> Can you isolate this VR to a different physical host and upstream switch?
> If you leave a ping going to some external ip (e.g. 8.8.8.8) from the VR, do you
> still lose connectivity?
> 
> ________________________________________
> From: Nux! <nu...@li.nux.ro>
> Sent: Monday, June 1, 2015 12:15 PM
> To: users@cloudstack.apache.org
> Cc: dev@cloudstack.apache.org
> Subject: Re: Regular total loss of connectivity
> 
> Ok, no luck with e1000 either, connectivity is lost after 4 hours (14400 sec).
> That can't be random.
> 
> Any ideas?
> 
> --
> Sent from the Delta quadrant using Borg technology!
> 
> Nux!
> www.nux.ro
> 
> ----- Original Message -----
>> From: "Nux!" <nu...@li.nux.ro>
>> To: users@cloudstack.apache.org
>> Cc: dev@cloudstack.apache.org
>> Sent: Monday, 1 June, 2015 14:35:09
>> Subject: Re: Regular total loss of connectivity
> 
>> Nope, it's a regular, non-redundant VR.
>>
>> I've switched to using e1000 instead of virtio, waiting a few hours to see how
>> it pans out. :-)
>>
>> --
>> Sent from the Delta quadrant using Borg technology!
>>
>> Nux!
>> www.nux.ro
>>
>> ----- Original Message -----
>>> From: "Simon Weller" <sw...@ena.com>
>>> To: dev@cloudstack.apache.org
>>> Cc: "Cloudstack Users List" <us...@cloudstack.apache.org>
>>> Sent: Monday, 1 June, 2015 13:32:26
>>> Subject: Re: Regular total loss of connectivity
>>
>>> Is this VR in a redundant pair? If so, does stopping the master and allowing the
>>> slave to take over allow the flow of traffic to resume?

Re: Regular total loss of connectivity

Posted by Nux! <nu...@li.nux.ro>.

Ok, managed to find the problem. It's a combination of settings on the provider's Cisco port and the VR.
I'm not sure how this kind of problem always finds me, perhaps some gypsy curse. :-)

The public IPs I am using are statically assigned in the Cisco VLAN, and some random private IP is assigned to this port so that it stays up.
Because this private IP is not part of the subnet I am using, and also because arp_ignore is set to 2 in the VR, all connectivity is lost after 14400 seconds (the default ARP cache timeout on Cisco, I am told).
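
A rough sketch of why arp_ignore=2 bites here, assuming the Cisco port re-ARPs for the public IPs from its private address once its cache entry expires. The function is a simplified model of arp_ignore modes 0 to 2 as described in the kernel's ip-sysctl documentation, not the kernel code itself. The addresses (203.0.113.10/24 for the VR, 10.0.0.1 for the switch port) are made-up examples, not the real ones.

```python
from ipaddress import ip_address, ip_interface


def should_reply(arp_ignore, target_ip, sender_ip, incoming_addrs, other_addrs=()):
    """Simplified model of Linux arp_ignore modes 0-2 for one ARP request.

    incoming_addrs: addresses (with prefix) on the interface the request arrived on
    other_addrs:    addresses configured on the host's other interfaces
    """
    target = ip_address(target_ip)
    sender = ip_address(sender_ip)
    incoming = [ip_interface(a) for a in incoming_addrs]
    others = [ip_interface(a) for a in other_addrs]

    if arp_ignore == 0:
        # Reply for any local target address, whichever interface holds it.
        return any(target == a.ip for a in incoming + others)
    if arp_ignore == 1:
        # Reply only if the target is configured on the incoming interface.
        return any(target == a.ip for a in incoming)
    if arp_ignore == 2:
        # As mode 1, but the sender must also sit in the same subnet.
        return any(target == a.ip and sender in a.network for a in incoming)
    raise ValueError("only modes 0-2 are modelled here")


VR_PUBLIC = ["203.0.113.10/24"]  # hypothetical public IP on the VR
SWITCH_IP = "10.0.0.1"           # hypothetical private IP on the Cisco port

print(should_reply(2, "203.0.113.10", SWITCH_IP, VR_PUBLIC))      # False
print(should_reply(0, "203.0.113.10", SWITCH_IP, VR_PUBLIC))      # True
print(should_reply(2, "203.0.113.10", "203.0.113.1", VR_PUBLIC))  # True
```

With mode 2 the request from the out-of-subnet sender goes unanswered, which would match the switch losing its ARP entry for the public IPs once the old one times out.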

The problem can be fixed either by changing arp_ignore to 0 (what I did) or by setting a proper IP from the same subnet on the Cisco port.
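
A minimal sketch for checking which arp_ignore value actually applies on an interface, assuming a Linux VR with the usual /proc/sys layout. Per the kernel's ip-sysctl documentation, the larger of conf/all and conf/<interface> is used, so both need to be 0 for the change to take effect. The function name and the `root` parameter are illustrative only.

```python
from pathlib import Path


def effective_arp_ignore(iface, root="/proc/sys/net/ipv4/conf"):
    """Return the arp_ignore value the kernel applies to requests on iface.

    The kernel takes the maximum of conf/all/arp_ignore and
    conf/<iface>/arp_ignore, so both files are read here.
    """
    values = []
    for name in ("all", iface):
        text = (Path(root) / name / "arp_ignore").read_text()
        values.append(int(text.strip()))
    return max(values)
```

Calling `effective_arp_ignore("eth2")` on the VR (the interface name is an assumption, use whichever NIC carries the public IPs) should return 0 after the change. The setting itself can be changed with `sysctl -w`, for example on `net.ipv4.conf.all.arp_ignore` and the per-interface key.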

What makes me scratch my head is why this problem was triggered all of a sudden ... it worked fine for many months before the power cut.

Thanks

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

----- Original Message -----
> From: "Nux!" <nu...@li.nux.ro>
> To: dev@cloudstack.apache.org
> Cc: users@cloudstack.apache.org
> Sent: Monday, 1 June, 2015 22:19:58
> Subject: Re: Regular total loss of connectivity

> I was using the latest stock EL6, but am now trying kernel 4.x and qemu-rhev from
> ovirt.org.
> 
> Thanks,
> Lucian

Re: Regular total loss of connectivity

Posted by Nux! <nu...@li.nux.ro>.

I was using the latest stock EL6, but am now trying kernel 4.x and qemu-rhev from ovirt.org.

Thanks,
Lucian

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

----- Original Message -----
> From: "Rafael Fonseca" <rs...@gmail.com>
> To: dev@cloudstack.apache.org
> Cc: users@cloudstack.apache.org
> Sent: Monday, 1 June, 2015 18:42:23
> Subject: Re: Regular total loss of connectivity

> There are also a few KVM bugs which could be related. What kernel and
> qemu-kvm versions are you running?
> 
> Rafael

Re: Regular total loss of connectivity

Posted by Rafael Fonseca <rs...@gmail.com>.

There are also a few KVM bugs which could be related. What kernel and
qemu-kvm versions are you running?

Rafael

On Mon, Jun 1, 2015 at 7:23 PM, Simon Weller <sw...@ena.com> wrote:

> What does arp show you on the VR when this occurs?
> Can you isolate this VR to a different physical host and upstream switch?
> If you leave a ping going to some external ip (e.g. 8.8.8.8) from the VR,
> do you still lose connectivity?
>
> ________________________________________
> From: Nux! <nu...@li.nux.ro>
> Sent: Monday, June 1, 2015 12:15 PM
> To: users@cloudstack.apache.org
> Cc: dev@cloudstack.apache.org
> Subject: Re: Regular total loss of connectivity
>
> Ok, no luck with e1000 either, connectivity is lost after 4 hours (14400
> sec). That can't be random.
>
> Any ideas?
>
> --
> Sent from the Delta quadrant using Borg technology!
>
> Nux!
> www.nux.ro
>
> ----- Original Message -----
> > From: "Nux!" <nu...@li.nux.ro>
> > To: users@cloudstack.apache.org
> > Cc: dev@cloudstack.apache.org
> > Sent: Monday, 1 June, 2015 14:35:09
> > Subject: Re: Regular total loss of connectivity
>
> > Nope, it's a regular, non-redundant VR.
> >
> > I've switched to using e1000 instead of virtio, waiting for a few hours,
> so how
> > it pans out. :-)
> >
> > --
> > Sent from the Delta quadrant using Borg technology!
> >
> > Nux!
> > www.nux.ro
> >
> > ----- Original Message -----
> >> From: "Simon Weller" <sw...@ena.com>
> >> To: dev@cloudstack.apache.org
> >> Cc: "Cloudstack Users List" <us...@cloudstack.apache.org>
> >> Sent: Monday, 1 June, 2015 13:32:26
> >> Subject: Re: Regular total loss of connectivity
> >
> >> Is this VR in a redundant pair? If so, does stopping the master and
> allowing the
> >> slave to take over allow the flow of traffic to resume?
> >>
> >>
> >> ________________________________________
> >> From: Nux! <nu...@li.nux.ro>
> >> Sent: Monday, June 1, 2015 6:45 AM
> >> To: dev@cloudstack.apache.org
> >> Cc: Cloudstack Users List
> >> Subject: Re: Regular total loss of connectivity
> >>
> >> Thanks Simon,
> >>
> >> link up/down has not helped, setting tso etc off on the link has not
> helped
> >> either.
> >> Connectivity is lost as usual after ~4 hours.
> >>
> >> I found some suggestions to try and use the e1000 nic instead of
> virtio, will do
> >> that.
> >>
> >> Lucian
> >>
> >> --
> >> Sent from the Delta quadrant using Borg technology!
> >>
> >> Nux!
> >> www.nux.ro
> >>
> >> ----- Original Message -----
> >>> From: "Simon Weller" <sw...@ena.com>
> >>> To: "dev" <de...@cloudstack.apache.org>, "Cloudstack Users List"
> >>> <us...@cloudstack.apache.org>
> >>> Sent: Sunday, 31 May, 2015 22:36:56
> >>> Subject: Re: Regular total loss of connectivity
> >>
> >>> If you ifdown the interface on the router and then ifup it again, does the arp
> >>> problem resolve itself?
> >>>
> >>> We've seen a similar issue before caused by malicious/heavy traffic related to
> >>> this bug:
> >>>
> >>> https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/997978
> >>>
> >>> - Si
> >>> ________________________________________
> >>> From: Nux! <nu...@li.nux.ro>
> >>> Sent: Sunday, May 31, 2015 4:25 PM
> >>> To: dev; Cloudstack Users List
> >>> Subject: Regular total loss of connectivity
> >>>
> >>> Hi,
> >>>
> >>> Following a power cut, one of my cloudstack deployments is having a really weird
> >>> problem that I cannot seem to solve on my own.
> >>> Every 3 hours all the public IPs on the VR stop responding from the Internet.
> >>> From the VR they are of course all reachable.
> >>> In the same VLAN as the public IPs there is another physical server, this one
> >>> can also access the VMs on their IPs just fine.
> >>>
> >>> The provider has not found the problem and hints at problems with the cloud
> >>> platform, however cloudstack worked just fine until the power cut, not to
> >>> mention the problem persists through HV and ACS upgrades.
> >>>
> >>> I'm thinking network side arp issues or something like this, alas I am not that
> >>> good with network stuff and don't have access to it anyway.
> >>>
> >>> If I reboot the VR once or twice the IPs start working again and the VMs are
> >>> accessible from the internet.
> >>>
> >>> Ideas?
> >>>
> >>> Env: CentOS 6, KVM, ACS 4.4 to 4.5.1, Adv zone
> >>>
> >>> Lucian
> >>>
> >>> --
> >>> Sent from the Delta quadrant using Borg technology!
> >>>
> >>> Nux!
> >>> www.nux.ro
>

Re: Regular total loss of connectivity

Posted by Nux! <nu...@li.nux.ro>.

Arp remains unchanged when it happens, I only have one host.
I'll try the ping thing next.
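
For anyone wanting to run the same ARP check, here is a minimal Python sketch that compares two snapshots of the ARP table, one taken while things work and one taken during an outage. It assumes the snapshots are in the Linux /proc/net/arp text layout. The function names, addresses and MACs below are made up for illustration and are not taken from this deployment.

```python
# Hypothetical snapshots in /proc/net/arp layout (documentation addresses only).
SAMPLE_BEFORE = """\
IP address       HW type     Flags       HW address            Mask     Device
192.0.2.1        0x1         0x2         00:00:5e:00:53:01     *        eth2
192.0.2.10       0x1         0x2         00:00:5e:00:53:0a     *        eth2
"""

SAMPLE_DURING = """\
IP address       HW type     Flags       HW address            Mask     Device
192.0.2.1        0x1         0x2         00:00:5e:00:53:ff     *        eth2
"""


def parse_arp(text):
    """Parse /proc/net/arp-style text into {ip: (mac, device)}."""
    entries = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore malformed rows
        ip, _hw_type, _flags, mac, _mask, device = fields[:6]
        entries[ip] = (mac.lower(), device)
    return entries


def diff_arp(before, after):
    """Return entries added, removed, or changed between two snapshots."""
    added = {ip: after[ip] for ip in after.keys() - before.keys()}
    removed = {ip: before[ip] for ip in before.keys() - after.keys()}
    changed = {
        ip: (before[ip], after[ip])
        for ip in before.keys() & after.keys()
        if before[ip] != after[ip]
    }
    return added, removed, changed


added, removed, changed = diff_arp(parse_arp(SAMPLE_BEFORE), parse_arp(SAMPLE_DURING))
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```

If all three results come back empty, as in my case, the VR's own ARP view did not change and the problem is more likely somewhere upstream of it.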

Thanks
Lucian

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

----- Original Message -----
> From: "Simon Weller" <sw...@ena.com>
> To: users@cloudstack.apache.org, dev@cloudstack.apache.org
> Sent: Monday, 1 June, 2015 18:23:13
> Subject: Re: Regular total loss of connectivity

> What does arp show you on the VR when this occurs?
> Can you isolate this VR to a different physical host and upstream switch?
> If you leave a ping going to some external ip (e.g. 8.8.8.8) from the VR, do you
> still lose connectivity?
> 
> ________________________________________
> From: Nux! <nu...@li.nux.ro>
> Sent: Monday, June 1, 2015 12:15 PM
> To: users@cloudstack.apache.org
> Cc: dev@cloudstack.apache.org
> Subject: Re: Regular total loss of connectivity
> 
> Ok, no luck with e1000 either, connectivity is lost after 4 hours (14400 sec).
> That can't be random.
> 
> Any ideas?
> 
> --
> Sent from the Delta quadrant using Borg technology!
> 
> Nux!
> www.nux.ro
> 
> ----- Original Message -----
>> From: "Nux!" <nu...@li.nux.ro>
>> To: users@cloudstack.apache.org
>> Cc: dev@cloudstack.apache.org
>> Sent: Monday, 1 June, 2015 14:35:09
>> Subject: Re: Regular total loss of connectivity
> 
>> Nope, it's a regular, non-redundant VR.
>>
>> I've switched to using e1000 instead of virtio, waiting for a few hours to see how
>> it pans out. :-)
>>
>> --
>> Sent from the Delta quadrant using Borg technology!
>>
>> Nux!
>> www.nux.ro
>>
>> ----- Original Message -----
>>> From: "Simon Weller" <sw...@ena.com>
>>> To: dev@cloudstack.apache.org
>>> Cc: "Cloudstack Users List" <us...@cloudstack.apache.org>
>>> Sent: Monday, 1 June, 2015 13:32:26
>>> Subject: Re: Regular total loss of connectivity
>>
>>> Is this VR in a redundant pair? If so, does stopping the master and allowing the
>>> slave to take over allow the flow of traffic to resume?
>>>
>>>
>>> ________________________________________
>>> From: Nux! <nu...@li.nux.ro>
>>> Sent: Monday, June 1, 2015 6:45 AM
>>> To: dev@cloudstack.apache.org
>>> Cc: Cloudstack Users List
>>> Subject: Re: Regular total loss of connectivity
>>>
>>> Thanks Simon,
>>>
>>> link up/down has not helped, setting tso etc off on the link has not helped
>>> either.
>>> Connectivity is lost as usual after ~4 hours.
>>>
>>> I found some suggestions to try and use the e1000 nic instead of virtio, will do
>>> that.
>>>
>>> Lucian
>>>
>>> --
>>> Sent from the Delta quadrant using Borg technology!
>>>
>>> Nux!
>>> www.nux.ro
>>>
>>> ----- Original Message -----
>>>> From: "Simon Weller" <sw...@ena.com>
>>>> To: "dev" <de...@cloudstack.apache.org>, "Cloudstack Users List"
>>>> <us...@cloudstack.apache.org>
>>>> Sent: Sunday, 31 May, 2015 22:36:56
>>>> Subject: Re: Regular total loss of connectivity
>>>
>>>> If you ifdown the interface on the router and then ifup it again, does the arp
>>>> problem resolve itself?
>>>>
>>>> We've seen a similar issue before caused by malicious/heavy traffic related to
>>>> this bug:
>>>>
>>>> https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/997978
>>>>
>>>> - Si
>>>> ________________________________________
>>>> From: Nux! <nu...@li.nux.ro>
>>>> Sent: Sunday, May 31, 2015 4:25 PM
>>>> To: dev; Cloudstack Users List
>>>> Subject: Regular total loss of connectivity
>>>>
>>>> Hi,
>>>>
>>>> Following a power cut, one of my cloudstack deployments is having a really weird
>>>> problem that I cannot seem to solve on my own.
>>>> Every 3 hours all the public IPs on the VR stop responding from the Internet.
>>>> From the VR they are of course all reachable.
>>>> In the same VLAN as the public IPs there is another physical server, this one
>>>> can also access the VMs on their IPs just fine.
>>>>
>>>> The provider has not found the problem and hints at problems with the cloud
>>>> platform, however cloudstack worked just fine until the power cut, not to
>>>> mention the problem persists through HV and ACS upgrades.
>>>>
>>>> I'm thinking network side arp issues or something like this, alas I am not that
>>>> good with network stuff and don't have access to it anyway.
>>>>
>>>> If I reboot the VR once or twice the IPs start working again and the VMs are
>>>> accessible from the internet.
>>>>
>>>> Ideas?
>>>>
>>>> Env: CentOS 6, KVM, ACS 4.4 to 4.5.1, Adv zone
>>>>
>>>> Lucian
>>>>
>>>> --
>>>> Sent from the Delta quadrant using Borg technology!
>>>>
>>>> Nux!
>>>> www.nux.ro

Re: Regular total loss of connectivity

Posted by Simon Weller <sw...@ena.com>.

What does arp show you on the VR when this occurs?
Can you isolate this VR to a different physical host and upstream switch?
If you leave a ping going to some external ip (e.g. 8.8.8.8) from the VR, do you still lose connectivity?
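
A rough sketch of the ping test, in Python, in case it helps. It records only the moments when reachability changes, so a multi-hour run stays readable. The `ping -c 1 -W` flags assume the Linux iputils ping; `ping_once` and `watch` are illustrative names, not part of CloudStack or the VR.

```python
import subprocess
import time


def ping_once(host, timeout_s=2):
    """Return True if one ICMP echo to host is answered (uses the system ping)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def watch(probe, rounds, interval_s=0.0, clock=time.time, sleep=time.sleep):
    """Call probe() `rounds` times; return (timestamp, state) for each state change."""
    transitions = []
    last = None
    for _ in range(rounds):
        up = bool(probe())
        if up != last:
            # Record only changes (including the very first observation).
            transitions.append((clock(), "up" if up else "down"))
            last = up
        sleep(interval_s)
    return transitions


if __name__ == "__main__":
    # Probe every 10 seconds for a bit over 5 hours, then print the changes.
    for stamp, state in watch(lambda: ping_once("8.8.8.8"), rounds=1900, interval_s=10):
        print(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(stamp)), state)
```

The probe is passed in as a function so the same loop can be pointed at the gateway or at one of the public IPs instead of 8.8.8.8.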

________________________________________
From: Nux! <nu...@li.nux.ro>
Sent: Monday, June 1, 2015 12:15 PM
To: users@cloudstack.apache.org
Cc: dev@cloudstack.apache.org
Subject: Re: Regular total loss of connectivity

Ok, no luck with e1000 either, connectivity is lost after 4 hours (14400 sec). That can't be random.

Any ideas?

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

----- Original Message -----
> From: "Nux!" <nu...@li.nux.ro>
> To: users@cloudstack.apache.org
> Cc: dev@cloudstack.apache.org
> Sent: Monday, 1 June, 2015 14:35:09
> Subject: Re: Regular total loss of connectivity

> Nope, it's a regular, non-redundant VR.
>
> I've switched to using e1000 instead of virtio, waiting for a few hours to see how
> it pans out. :-)
>
> --
> Sent from the Delta quadrant using Borg technology!
>
> Nux!
> www.nux.ro
>
> ----- Original Message -----
>> From: "Simon Weller" <sw...@ena.com>
>> To: dev@cloudstack.apache.org
>> Cc: "Cloudstack Users List" <us...@cloudstack.apache.org>
>> Sent: Monday, 1 June, 2015 13:32:26
>> Subject: Re: Regular total loss of connectivity
>
>> Is this VR in a redundant pair? If so, does stopping the master and allowing the
>> slave to take over allow the flow of traffic to resume?
>>
>>
>> ________________________________________
>> From: Nux! <nu...@li.nux.ro>
>> Sent: Monday, June 1, 2015 6:45 AM
>> To: dev@cloudstack.apache.org
>> Cc: Cloudstack Users List
>> Subject: Re: Regular total loss of connectivity
>>
>> Thanks Simon,
>>
>> link up/down has not helped, setting tso etc off on the link has not helped
>> either.
>> Connectivity is lost as usual after ~4 hours.
>>
>> I found some suggestions to try and use the e1000 nic instead of virtio, will do
>> that.
>>
>> Lucian
>>
>> --
>> Sent from the Delta quadrant using Borg technology!
>>
>> Nux!
>> www.nux.ro
>>
>> ----- Original Message -----
>>> From: "Simon Weller" <sw...@ena.com>
>>> To: "dev" <de...@cloudstack.apache.org>, "Cloudstack Users List"
>>> <us...@cloudstack.apache.org>
>>> Sent: Sunday, 31 May, 2015 22:36:56
>>> Subject: Re: Regular total loss of connectivity
>>
>>> If you ifdown the interface on the router and then ifup it again, does the arp
>>> problem resolve itself?
>>>
>>> We've seen a similar issue before caused by malicious/heavy traffic related to
>>> this bug:
>>>
>>> https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/997978
>>>
>>> - Si
>>> ________________________________________
>>> From: Nux! <nu...@li.nux.ro>
>>> Sent: Sunday, May 31, 2015 4:25 PM
>>> To: dev; Cloudstack Users List
>>> Subject: Regular total loss of connectivity
>>>
>>> Hi,
>>>
>>> Following a power cut, one of my cloudstack deployments is having a really weird
>>> problem that I cannot seem to solve on my own.
>>> Every 3 hours all the public IPs on the VR stop responding from the Internet.
>>> From the VR they are of course all reachable.
>>> In the same VLAN as the public IPs there is another physical server, this one
>>> can also access the VMs on their IPs just fine.
>>>
>>> The provider has not found the problem and hints at problems with the cloud
>>> platform, however cloudstack worked just fine until the power cut, not to
>>> mention the problem persists through HV and ACS upgrades.
>>>
>>> I'm thinking network side arp issues or something like this, alas I am not that
>>> good with network stuff and don't have access to it anyway.
>>>
>>> If I reboot the VR once or twice the IPs start working again and the VMs are
>>> accessible from the internet.
>>>
>>> Ideas?
>>>
>>> Env: CentOS 6, KVM, ACS 4.4 to 4.5.1, Adv zone
>>>
>>> Lucian
>>>
>>> --
>>> Sent from the Delta quadrant using Borg technology!
>>>
>>> Nux!
>>> www.nux.ro

Re: Regular total loss of connectivity

Posted by Nux! <nu...@li.nux.ro>.

Ok, no luck with e1000 either, connectivity is lost after 4 hours (14400 sec). That can't be random.
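
To back up the "not random" claim, the gaps between outages can be checked against 4 h = 4 * 60 * 60 = 14400 s. Below is a small Python sketch of that check. The timestamps are hypothetical placeholders rather than my real logs, and the 60-second tolerance is an arbitrary assumption.

```python
from datetime import datetime

# Hypothetical outage start times, for illustration only.
SAMPLE_OUTAGES = [
    "2015-06-01 02:00:10",
    "2015-06-01 06:00:05",
    "2015-06-01 10:00:12",
]


def outage_intervals(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Return the gaps, in seconds, between consecutive outage start times."""
    times = sorted(datetime.strptime(t, fmt) for t in timestamps)
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


def looks_periodic(intervals, expected_s=14400, tolerance_s=60):
    """True if there is at least one gap and every gap is within tolerance of expected_s."""
    return bool(intervals) and all(
        abs(gap - expected_s) <= tolerance_s for gap in intervals
    )


gaps = outage_intervals(SAMPLE_OUTAGES)
print(gaps, looks_periodic(gaps))
```

A steady gap like this points at some timer expiring (a cache or lease of some kind) rather than at load or a flaky link, though the sketch cannot say which timer.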

Any ideas?

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

----- Original Message -----
> From: "Nux!" <nu...@li.nux.ro>
> To: users@cloudstack.apache.org
> Cc: dev@cloudstack.apache.org
> Sent: Monday, 1 June, 2015 14:35:09
> Subject: Re: Regular total loss of connectivity

> Nope, it's a regular, non-redundant VR.
> 
> I've switched to using e1000 instead of virtio, waiting for a few hours to see how
> it pans out. :-)
> 
> --
> Sent from the Delta quadrant using Borg technology!
> 
> Nux!
> www.nux.ro
> 
> ----- Original Message -----
>> From: "Simon Weller" <sw...@ena.com>
>> To: dev@cloudstack.apache.org
>> Cc: "Cloudstack Users List" <us...@cloudstack.apache.org>
>> Sent: Monday, 1 June, 2015 13:32:26
>> Subject: Re: Regular total loss of connectivity
> 
>> Is this VR in a redundant pair? If so, does stopping the master and allowing the
>> slave to take over allow the flow of traffic to resume?
>> 
>> 
>> ________________________________________
>> From: Nux! <nu...@li.nux.ro>
>> Sent: Monday, June 1, 2015 6:45 AM
>> To: dev@cloudstack.apache.org
>> Cc: Cloudstack Users List
>> Subject: Re: Regular total loss of connectivity
>> 
>> Thanks Simon,
>> 
>> link up/down has not helped, setting tso etc off on the link has not helped
>> either.
>> Connectivity is lost as usual after ~4 hours.
>> 
>> I found some suggestions to try and use the e1000 nic instead of virtio, will do
>> that.
>> 
>> Lucian
>> 
>> --
>> Sent from the Delta quadrant using Borg technology!
>> 
>> Nux!
>> www.nux.ro
>> 
>> ----- Original Message -----
>>> From: "Simon Weller" <sw...@ena.com>
>>> To: "dev" <de...@cloudstack.apache.org>, "Cloudstack Users List"
>>> <us...@cloudstack.apache.org>
>>> Sent: Sunday, 31 May, 2015 22:36:56
>>> Subject: Re: Regular total loss of connectivity
>> 
>>> If you ifdown the interface on the router and then ifup it again, does the arp
>>> problem resolve itself?
>>>
>>> We've seen a similar issue before caused by malicious/heavy traffic related to
>>> this bug:
>>>
>>> https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/997978
>>>
>>> - Si
>>> ________________________________________
>>> From: Nux! <nu...@li.nux.ro>
>>> Sent: Sunday, May 31, 2015 4:25 PM
>>> To: dev; Cloudstack Users List
>>> Subject: Regular total loss of connectivity
>>>
>>> Hi,
>>>
>>> Following a power cut, one of my cloudstack deployments is having a really weird
>>> problem that I cannot seem to solve on my own.
>>> Every 3 hours all the public IPs on the VR stop responding from the Internet.
>>> From the VR they are of course all reachable.
>>> In the same VLAN as the public IPs there is another physical server, this one
>>> can also access the VMs on their IPs just fine.
>>>
>>> The provider has not found the problem and hints at problems with the cloud
>>> platform, however cloudstack worked just fine until the power cut, not to
>>> mention the problem persists through HV and ACS upgrades.
>>>
>>> I'm thinking network side arp issues or something like this, alas I am not that
>>> good with network stuff and don't have access to it anyway.
>>>
>>> If I reboot the VR once or twice the IPs start working again and the VMs are
>>> accessible from the internet.
>>>
>>> Ideas?
>>>
>>> Env: CentOS 6, KVM, ACS 4.4 to 4.5.1, Adv zone
>>>
>>> Lucian
>>>
>>> --
>>> Sent from the Delta quadrant using Borg technology!
>>>
>>> Nux!
>>> www.nux.ro

Re: Regular total loss of connectivity

Posted by Nux! <nu...@li.nux.ro>.

Nope, it's a regular, non-redundant VR.

I've switched to using e1000 instead of virtio, waiting for a few hours to see how it pans out. :-)

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

----- Original Message -----
> From: "Simon Weller" <sw...@ena.com>
> To: dev@cloudstack.apache.org
> Cc: "Cloudstack Users List" <us...@cloudstack.apache.org>
> Sent: Monday, 1 June, 2015 13:32:26
> Subject: Re: Regular total loss of connectivity

> Is this VR in a redundant pair? If so, does stopping the master and allowing the
> slave to take over allow the flow of traffic to resume?
> 
> 
> ________________________________________
> From: Nux! <nu...@li.nux.ro>
> Sent: Monday, June 1, 2015 6:45 AM
> To: dev@cloudstack.apache.org
> Cc: Cloudstack Users List
> Subject: Re: Regular total loss of connectivity
> 
> Thanks Simon,
> 
> link up/down has not helped, setting tso etc off on the link has not helped
> either.
> Connectivity is lost as usual after ~4 hours.
> 
> I found some suggestions to try and use the e1000 nic instead of virtio, will do
> that.
> 
> Lucian
> 
> --
> Sent from the Delta quadrant using Borg technology!
> 
> Nux!
> www.nux.ro
> 
> ----- Original Message -----
>> From: "Simon Weller" <sw...@ena.com>
>> To: "dev" <de...@cloudstack.apache.org>, "Cloudstack Users List"
>> <us...@cloudstack.apache.org>
>> Sent: Sunday, 31 May, 2015 22:36:56
>> Subject: Re: Regular total loss of connectivity
> 
>> If you ifdown the interface on the router and then ifup it again, does the arp
>> problem resolve itself?
>>
>> We've seen a similar issue before caused by malicious/heavy traffic related to
>> this bug:
>>
>> https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/997978
>>
>> - Si
>> ________________________________________
>> From: Nux! <nu...@li.nux.ro>
>> Sent: Sunday, May 31, 2015 4:25 PM
>> To: dev; Cloudstack Users List
>> Subject: Regular total loss of connectivity
>>
>> Hi,
>>
>> Following a power cut, one of my cloudstack deployments is having a really weird
>> problem that I cannot seem to solve on my own.
>> Every 3 hours all the public IPs on the VR stop responding from the Internet.
>> From the VR they are of course all reachable.
>> In the same VLAN as the public IPs there is another physical server, this one
>> can also access the VMs on their IPs just fine.
>>
>> The provider has not found the problem and hints at problems with the cloud
>> platform, however cloudstack worked just fine until the power cut, not to
>> mention the problem persists through HV and ACS upgrades.
>>
>> I'm thinking network side arp issues or something like this, alas I am not that
>> good with network stuff and don't have access to it anyway.
>>
>> If I reboot the VR once or twice the IPs start working again and the VMs are
>> accessible from the internet.
>>
>> Ideas?
>>
>> Env: CentOS 6, KVM, ACS 4.4 to 4.5.1, Adv zone
>>
>> Lucian
>>
>> --
>> Sent from the Delta quadrant using Borg technology!
>>
>> Nux!
>> www.nux.ro

Re: Regular total loss of connectivity

Posted by Nux! <nu...@li.nux.ro>.

Nope, it's a regular, non-redundant VR.

I've switched to using e1000 instead of virtio, waiting for a few hours, so how it pans out. :-)

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

----- Original Message -----
> From: "Simon Weller" <sw...@ena.com>
> To: dev@cloudstack.apache.org
> Cc: "Cloudstack Users List" <us...@cloudstack.apache.org>
> Sent: Monday, 1 June, 2015 13:32:26
> Subject: Re: Regular total loss of connectivity

> Is this VR in a redundant pair? If so, does stopping the master and allowing the
> slave to take over allow the flow of traffic to resume?
> 
> 
> ________________________________________
> From: Nux! <nu...@li.nux.ro>
> Sent: Monday, June 1, 2015 6:45 AM
> To: dev@cloudstack.apache.org
> Cc: Cloudstack Users List
> Subject: Re: Regular total loss of connectivity
> 
> Thanks Simon,
> 
> link up/down has not helped, setting tso etc off on the link has not helped
> either.
> Connectivity is lost as usual after ~4 hours.
> 
> I found some suggestions to try and use the e1000 nic instead of virtio, will do
> that.
> 
> Lucian
> 
> --
> Sent from the Delta quadrant using Borg technology!
> 
> Nux!
> www.nux.ro
> 
> ----- Original Message -----
>> From: "Simon Weller" <sw...@ena.com>
>> To: "dev" <de...@cloudstack.apache.org>, "Cloudstack Users List"
>> <us...@cloudstack.apache.org>
>> Sent: Sunday, 31 May, 2015 22:36:56
>> Subject: Re: Regular total loss of connectivity
> 
>> If you ifdown the interface on the router and then ifup it again, does the arp
>> problem resolve itself?
>>
>> We've seen a similar issue before caused by malicious/heavy traffic related to
>> this bug:
>>
>> https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/997978
>>
>> - Si
>> ________________________________________
>> From: Nux! <nu...@li.nux.ro>
>> Sent: Sunday, May 31, 2015 4:25 PM
>> To: dev; Cloudstack Users List
>> Subject: Regular total loss of connectivity
>>
>> Hi,
>>
>> Following a power cut, one of my cloudstack deployments is having a really weird
>> problem that I cannot seem to solve on my own.
>> Every 3 hours all the public IPs on the VR stop responding from the Internet.
>> From the VR they are of course all reachable.
>> In the same VLAN as the public IPs there is another physical server, this one
>> can also access the VMs on their IPs just fine.
>>
>> The provider has not found the problem and hints at problems with the cloud
>> platform, however cloudstack worked just fine until the power cut, not to
>> mention the problem persists through HV and ACS upgrades.
>>
>> I'm thinking network side arp issues or something like this, alas I am not that
>> good with network stuff and don't have access to it anyway.
>>
>> If I reboot the VR once or twice the IPs start working again and the VMs are
>> accessible from the internet.
>>
>> Ideas?
>>
>> Env: CentOS 6, KVM, ACS 4.4 to 4.5.1, Adv zone
>>
>> Lucian
>>
>> --
>> Sent from the Delta quadrant using Borg technology!
>>
>> Nux!
>> www.nux.ro

Re: Regular total loss of connectivity

Posted by Simon Weller <sw...@ena.com>.

Is this VR in a redundant pair? If so, does stopping the master and allowing the slave to take over allow the flow of traffic to resume?


________________________________________
From: Nux! <nu...@li.nux.ro>
Sent: Monday, June 1, 2015 6:45 AM
To: dev@cloudstack.apache.org
Cc: Cloudstack Users List
Subject: Re: Regular total loss of connectivity

Thanks Simon,

link up/down has not helped, setting tso etc off on the link has not helped either.
Connectivity is lost as usual after ~4 hours.

I found some suggestions to try and use the e1000 nic instead of virtio, will do that.

Lucian

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

----- Original Message -----
> From: "Simon Weller" <sw...@ena.com>
> To: "dev" <de...@cloudstack.apache.org>, "Cloudstack Users List" <us...@cloudstack.apache.org>
> Sent: Sunday, 31 May, 2015 22:36:56
> Subject: Re: Regular total loss of connectivity

> If you ifdown the interface on the router and then ifup it again, does the arp
> problem resolve itself?
>
> We've seen a similar issue before caused by malicious/heavy traffic related to
> this bug:
>
> https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/997978
>
> - Si
> ________________________________________
> From: Nux! <nu...@li.nux.ro>
> Sent: Sunday, May 31, 2015 4:25 PM
> To: dev; Cloudstack Users List
> Subject: Regular total loss of connectivity
>
> Hi,
>
> Following a power cut, one of my cloudstack deployments is having a really weird
> problem that I cannot seem to solve on my own.
> Every 3 hours all the public IPs on the VR stop responding from the Internet.
> From the VR they are of course all reachable.
> In the same VLAN as the public IPs there is another physical server, this one
> can also access the VMs on their IPs just fine.
>
> The provider has not found the problem and hints at problems with the cloud
> platform, however cloudstack worked just fine until the power cut, not to
> mention the problem persists through HV and ACS upgrades.
>
> I'm thinking network side arp issues or something like this, alas I am not that
> good with network stuff and don't have access to it anyway.
>
> If I reboot the VR once or twice the IPs start working again and the VMs are
> accessible from the internet.
>
> Ideas?
>
> Env: CentOS 6, KVM, ACS 4.4 to 4.5.1, Adv zone
>
> Lucian
>
> --
> Sent from the Delta quadrant using Borg technology!
>
> Nux!
> www.nux.ro
