Posted to users@cloudstack.apache.org by Nick Burke <ni...@nickburke.com> on 2014/08/15 20:40:21 UTC

intermittent packet loss after upgrading and restarting networks

I upgraded from 4.0 to 4.3.0 some time ago. I didn't restart anything and
it was all working great. However, I had to perform some maintenance and
had to restart everything. Now, I'm seeing packet loss on all virtuals,
even ones on the same host.

sudo ping -c 500  -f 172.20.1.1
PING 172.20.1.1 (172.20.1.1) 56(84) bytes of data.
........................................
--- 172.20.1.1 ping statistics ---
500 packets transmitted, 460 received, 8% packet loss, time 864ms
rtt min/avg/max/mdev = 0.069/0.218/1.290/0.139 ms, ipg/ewma 1.731/0.328 ms

No interface errors reported anywhere. The host itself isn't under load at
all. Doesn't matter if the instance uses e1000 or virtio for the drivers.
The only thing that I'm aware of that changed was that I had to reboot all
the physical servers.


Could be related, but I was hit with the

https://issues.apache.org/jira/browse/CLOUDSTACK-6464

bug. I did follow Marcus' suggestion:


*"This is a shot in the dark, but there have been some issues around
upgrades that involve the cloud.vlan table expected contents changing. New
4.3 installs using vlan isolation don't seem to reproduce the issue. I'll
see if I can reproduce anything like this with basic and/or non-vlan
isolated upgrades/installs. Can anyone experiencing an issue look at their
database via something like "select * from cloud.vlan" and look at the
vlan_id. If you see something like "untagged" instead of "vlan://untagged",
please try changing it and see if that helps."*
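
For anyone else hitting this, the check boils down to something like the
following (a sketch, assuming direct MySQL access to the cloud database;
take a backup before touching it):

  -- look for rows carrying the bare value instead of the uri form
  mysql> SELECT id, vlan_id FROM cloud.vlan;
  -- rewrite any 'untagged' rows to the form 4.3 expects
  mysql> UPDATE cloud.vlan SET vlan_id = 'vlan://untagged' WHERE vlan_id = 'untagged';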

-- 
Nick





*'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
unafraid to destroy itself in growing into a tree.' -David Zindell, A
Requiem for Homo Sapiens*

Re: intermittent packet loss after upgrading and restarting networks

Posted by Kirk Kosinski <ki...@gmail.com>.
Hi, did you check the documentation?  Specifically the Network
Throttling section [1].  In CloudStack, throttling can be configured
in a variety of places, and the net effect depends on the network
type and hypervisor, so it is hard to pin down.  If you haven't
already, check out that doc since it should be helpful.

Kirk

[1]
http://cloudstack.apache.org/docs/en-US/Apache_CloudStack/4.2.0/html/Admin_Guide/network-rate.html
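
As a quick cross-check of where a limit might be coming from, the relevant
values can be read back through the API; a rough sketch with cloudmonkey
(standard listing calls, nothing site-specific):

  > list configurations name=network.throttling.rate
  > list configurations name=vm.network.throttling.rate
  > list networkofferings
  > list serviceofferings

The networkrate field in the offering output is in Mbps; an empty value
means the global default applies.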

On 08/17/2014 04:38 PM, Nick Burke wrote:
> Another update:
> 
> 
> 100% confirmed to be traffic shapping set by CloudStack. I don't know
> where/how/why, and I'd love some help with this. Should I create a new
> thread? As previously mentioned, I don't believe I've set a cap of below
> 100Mbs ANYWHERE in Cloudstack. Not in compute offerings, network offerings,
> and not in the default throttle (which is set at 200).
> 
> What am I missing?
> 
> I removed tc rules on the host for two test instances and bandwidth shot up.
> 
> Before:
> 
> ubuntu@testserver01:~$ iperf -s
> ------------------------------------------------------------
> Server listening on TCP port 5001
> TCP window size: 85.3 KByte (default)
> ------------------------------------------------------------
> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59276
> [ ID] Interval       Transfer     Bandwidth
> [  4]  0.0-10.4 sec  6.62 MBytes  5.35 Mbits/sec
> [  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59277
> [  5]  0.0-10.5 sec  6.62 MBytes  5.28 Mbits/sec
> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59278
> [  4]  0.0-10.4 sec  6.62 MBytes  5.37 Mbits/sec
> [  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59291
> [  5]  0.0-10.3 sec  6.62 MBytes  5.37 Mbits/sec
> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59306
> [  4]  0.0-10.5 sec  6.62 MBytes  5.30 Mbits/sec
> 
> Removed the rules for two instances on the same host:
> 
> ubuntu@dom02:~$ sudo tc qdisc del dev vnet1 root
> ubuntu@dom02:~$ sudo tc qdisc del dev vnet3 root
> ubuntu@dom02:~$ sudo tc qdisc del dev vnet3 ingress
> ubuntu@dom02:~$ sudo tc qdisc del dev vnet1 ingress
> ubuntu@dom02:~$ tc -s qdisc ls dev vnet1
> qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1
> 1 1 1 1
>  Sent 7136572 bytes 1048 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> 
> And all of a sudden, those two instances are at blazing speeds:
> 
> ubuntu@testserver01:~$ iperf -s
> ------------------------------------------------------------
> Server listening on TCP port 5001
> TCP window size: 85.3 KByte (default)
> ------------------------------------------------------------
> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59322
> [ ID] Interval       Transfer     Bandwidth
> [  4]  0.0-10.0 sec  14.8 GBytes  12.7 Gbits/sec
> [  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59329
> [  5]  0.0-10.0 sec  19.1 GBytes  16.4 Gbits/sec
> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59330
> [  4]  0.0-10.0 sec  19.0 GBytes  16.3 Gbits/sec
> 
> 
> 
> 
> 
> On Sun, Aug 17, 2014 at 12:46 PM, Nick Burke <ni...@nickburke.com> wrote:
> 
>> First,
>>
>> THANK YOU FOR REPLYING!
>>
>> Second, yes, it's currently set at 200.
>>
>> The compute offering for network is either blank (or when I tested it,
>> 1000)
>> The network offering for network limit is either 100, 1000, or blank.
>>
>>
>> Those are the only network throttling parameters that I'm aware of, are
>> there any others that I missed? Is it possible disk i/o is for some reason
>> coming into play here?
>>
>> This happens regardless of if the instance network is either a virtual
>> router or is directly connected to a vlan(ie, no virtual router) when two
>> instances are directly connected to each other.
>>
>>
>>
>>
>>
>> On Sun, Aug 17, 2014 at 12:09 PM, ilya musayev <
>> ilya.mailing.lists@gmail.com> wrote:
>>
>>> Nick
>>>
>>> Have you checked network throttle settings in "global setting" and where
>>> ever else it may be defined?
>>>
>>> regads
>>> ilya
>>>
>>> On 8/17/14, 11:27 AM, Nick Burke wrote:
>>>
>>>> Update:
>>>>
>>>> After running nperf on same instances on the same virtual network, it
>>>> looks
>>>> like all instances can get no more than 2Mb/s. Additionally, it's
>>>> sporadic
>>>> and ranges from <1Mb/s, but never more than 2Mb/s:
>>>>
>>>> user@localhost:~$ iperf -c 10.1.0.1 -d
>>>> ------------------------------------------------------------
>>>> Server listening on TCP port 5001
>>>> TCP window size: 85.3 KByte (default)
>>>> ------------------------------------------------------------
>>>> ------------------------------------------------------------
>>>> Client connecting to 10.1.0.1, TCP port 5001
>>>> TCP window size: 86.8 KByte (default)
>>>> ------------------------------------------------------------
>>>> [  5] local 10.1.0.10 port 50432 connected with 10.1.0.1 port 5001
>>>> [ ID] Interval       Transfer     Bandwidth
>>>> [  5]  0.0-11.0 sec  1.25 MBytes   950 Kbits/sec
>>>> [  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53839
>>>> [  4]  0.0-11.1 sec  2.50 MBytes  1.89 Mbits/sec
>>>> user@localhost:~$ iperf -c 10.1.0.1 -d
>>>> ------------------------------------------------------------
>>>> Server listening on TCP port 5001
>>>> TCP window size: 85.3 KByte (default)
>>>> ------------------------------------------------------------
>>>> ------------------------------------------------------------
>>>> Client connecting to 10.1.0.1, TCP port 5001
>>>> TCP window size: 50.3 KByte (default)
>>>> ------------------------------------------------------------
>>>> [  5] local 10.1.0.10 port 52248 connected with 10.1.0.1 port 5001
>>>> [ ID] Interval       Transfer     Bandwidth
>>>> [  5]  0.0-12.6 sec  1.25 MBytes   834 Kbits/sec
>>>> [  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53840
>>>> [  4]  0.0-11.9 sec  2.13 MBytes  1.49 Mbits/sec
>>>>
>>>>
>>>>
>>>> On Fri, Aug 15, 2014 at 11:40 AM, Nick Burke <ni...@nickburke.com> wrote:
>>>>
>>>>  I upgraded from 4.0 to 4.3.0 some time ago. I didn't restart anything
>>>>> and
>>>>> it was all working great. However, I had to perform some maintenance and
>>>>> had to restart everything. Now, I'm seeing packet loss on all virtuals,
>>>>> even ones on the same host.
>>>>>
>>>>> sudo ping -c 500  -f 172.20.1.1
>>>>> PING 172.20.1.1 (172.20.1.1) 56(84) bytes of data.
>>>>> ........................................
>>>>> --- 172.20.1.1 ping statistics ---
>>>>> 500 packets transmitted, 460 received, 8% packet loss, time 864ms
>>>>> rtt min/avg/max/mdev = 0.069/0.218/1.290/0.139 ms, ipg/ewma 1.731/0.328
>>>>> ms
>>>>>
>>>>> No interface errors reported anywhere. The host itself isn't under load
>>>>> at
>>>>> all. Doesn't matter if the instance uses e1000 or virtio for the
>>>>> drivers.
>>>>> The only thing that I'm aware of that changed was that I had to reboot
>>>>> all
>>>>> the physical servers.
>>>>>
>>>>>
>>>>> Could be related, but I was hit with the
>>>>>
>>>>> https://issues.apache.org/jira/browse/CLOUDSTACK-6464
>>>>>
>>>>> bug. I did follow with Marcus' suggestion:
>>>>>
>>>>>
>>>>> *"This is a shot in the dark, but there have been some issues around
>>>>>
>>>>> upgrades that involve the cloud.vlan table expected contents changing.
>>>>> New
>>>>> 4.3 installs using vlan isolation don't seem to reproduce the issue.
>>>>> I'll
>>>>> see if I can reproduce anything like this with basic and/or non-vlan
>>>>> isolated upgrades/installs. Can anyone experiencing an issue look at
>>>>> their
>>>>> database via something like "select * from cloud.vlan" and look at the
>>>>> vlan_id. If you see something like "untagged" instead of
>>>>> "vlan://untagged",
>>>>> please try changing it and see if that helps."*
>>>>>
>>>>> --
>>>>> Nick
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
>>>>>
>>>>> unafraid to destroy itself in growing into a tree.' -David Zindell, A
>>>>> Requiem for Homo Sapiens*
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Nick
>>
>>
>>
>>
>>
>> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
>> unafraid to destroy itself in growing into a tree.' -David Zindell, A
>> Requiem for Homo Sapiens*
>>
> 
> 
> 

Re: intermittent packet loss after upgrading and restarting networks

Posted by ilya musayev <il...@gmail.com>.
This 200 Mbps limit is the default; we should change that.

If you can spend a minute of your time and file a request at
https://issues.apache.org/jira/browse/CLOUDSTACK/, you can assign it to
me and I will push the change later.
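
In the meantime it can be raised by hand; a sketch with cloudmonkey (the
value is only an example, in Mbps, and the management server typically
needs a restart for global settings to take effect):

  > update configuration name=network.throttling.rate value=1000
  > update configuration name=vm.network.throttling.rate value=1000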

On 8/18/14, 11:43 AM, Nick Burke wrote:
> Here are the snippets from cloudmonkey and tc:
>
> http://pastebin.com/30Fxj3PW
>
>
> On Sun, Aug 17, 2014 at 4:38 PM, Nick Burke <ni...@nickburke.com> wrote:
>
>> Another update:
>>
>>
>> 100% confirmed to be traffic shapping set by CloudStack. I don't know
>> where/how/why, and I'd love some help with this. Should I create a new
>> thread? As previously mentioned, I don't believe I've set a cap of below
>> 100Mbs ANYWHERE in Cloudstack. Not in compute offerings, network offerings,
>> and not in the default throttle (which is set at 200).
>>
>> What am I missing?
>>
>> I removed tc rules on the host for two test instances and bandwidth shot
>> up.
>>
>> Before:
>>
>> ubuntu@testserver01:~$ iperf -s
>>
>> ------------------------------------------------------------
>> Server listening on TCP port 5001
>> TCP window size: 85.3 KByte (default)
>> ------------------------------------------------------------
>> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59276
>> [ ID] Interval       Transfer     Bandwidth
>> [  4]  0.0-10.4 sec  6.62 MBytes  5.35 Mbits/sec
>> [  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59277
>> [  5]  0.0-10.5 sec  6.62 MBytes  5.28 Mbits/sec
>> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59278
>> [  4]  0.0-10.4 sec  6.62 MBytes  5.37 Mbits/sec
>> [  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59291
>> [  5]  0.0-10.3 sec  6.62 MBytes  5.37 Mbits/sec
>> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59306
>> [  4]  0.0-10.5 sec  6.62 MBytes  5.30 Mbits/sec
>>
>> Removed the rules for two instances on the same host:
>>
>> ubuntu@dom02:~$ sudo tc qdisc del dev vnet1 root
>> ubuntu@dom02:~$ sudo tc qdisc del dev vnet3 root
>> ubuntu@dom02:~$ sudo tc qdisc del dev vnet3 ingress
>> ubuntu@dom02:~$ sudo tc qdisc del dev vnet1 ingress
>> ubuntu@dom02:~$ tc -s qdisc ls dev vnet1
>> qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1
>> 1 1 1 1
>>   Sent 7136572 bytes 1048 pkt (dropped 0, overlimits 0 requeues 0)
>>   backlog 0b 0p requeues 0
>>
>> And all of a sudden, those two instances are at blazing speeds:
>>
>> ubuntu@testserver01:~$ iperf -s
>>
>> ------------------------------------------------------------
>> Server listening on TCP port 5001
>> TCP window size: 85.3 KByte (default)
>> ------------------------------------------------------------
>> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59322
>> [ ID] Interval       Transfer     Bandwidth
>> [  4]  0.0-10.0 sec  14.8 GBytes  12.7 Gbits/sec
>> [  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59329
>> [  5]  0.0-10.0 sec  19.1 GBytes  16.4 Gbits/sec
>> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59330
>> [  4]  0.0-10.0 sec  19.0 GBytes  16.3 Gbits/sec
>>
>>
>>
>>
>>
>> On Sun, Aug 17, 2014 at 12:46 PM, Nick Burke <ni...@nickburke.com> wrote:
>>
>>> First,
>>>
>>> THANK YOU FOR REPLYING!
>>>
>>> Second, yes, it's currently set at 200.
>>>
>>> The compute offering for network is either blank (or when I tested it,
>>> 1000)
>>> The network offering for network limit is either 100, 1000, or blank.
>>>
>>>
>>> Those are the only network throttling parameters that I'm aware of, are
>>> there any others that I missed? Is it possible disk i/o is for some reason
>>> coming into play here?
>>>
>>> This happens regardless of if the instance network is either a virtual
>>> router or is directly connected to a vlan(ie, no virtual router) when two
>>> instances are directly connected to each other.
>>>
>>>
>>>
>>>
>>>
>>> On Sun, Aug 17, 2014 at 12:09 PM, ilya musayev <
>>> ilya.mailing.lists@gmail.com> wrote:
>>>
>>>> Nick
>>>>
>>>> Have you checked network throttle settings in "global setting" and where
>>>> ever else it may be defined?
>>>>
>>>> regads
>>>> ilya
>>>>
>>>> On 8/17/14, 11:27 AM, Nick Burke wrote:
>>>>
>>>>> Update:
>>>>>
>>>>> After running nperf on same instances on the same virtual network, it
>>>>> looks
>>>>> like all instances can get no more than 2Mb/s. Additionally, it's
>>>>> sporadic
>>>>> and ranges from <1Mb/s, but never more than 2Mb/s:
>>>>>
>>>>> user@localhost:~$ iperf -c 10.1.0.1 -d
>>>>> ------------------------------------------------------------
>>>>> Server listening on TCP port 5001
>>>>> TCP window size: 85.3 KByte (default)
>>>>> ------------------------------------------------------------
>>>>> ------------------------------------------------------------
>>>>> Client connecting to 10.1.0.1, TCP port 5001
>>>>> TCP window size: 86.8 KByte (default)
>>>>> ------------------------------------------------------------
>>>>> [  5] local 10.1.0.10 port 50432 connected with 10.1.0.1 port 5001
>>>>> [ ID] Interval       Transfer     Bandwidth
>>>>> [  5]  0.0-11.0 sec  1.25 MBytes   950 Kbits/sec
>>>>> [  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53839
>>>>> [  4]  0.0-11.1 sec  2.50 MBytes  1.89 Mbits/sec
>>>>> user@localhost:~$ iperf -c 10.1.0.1 -d
>>>>> ------------------------------------------------------------
>>>>> Server listening on TCP port 5001
>>>>> TCP window size: 85.3 KByte (default)
>>>>> ------------------------------------------------------------
>>>>> ------------------------------------------------------------
>>>>> Client connecting to 10.1.0.1, TCP port 5001
>>>>> TCP window size: 50.3 KByte (default)
>>>>> ------------------------------------------------------------
>>>>> [  5] local 10.1.0.10 port 52248 connected with 10.1.0.1 port 5001
>>>>> [ ID] Interval       Transfer     Bandwidth
>>>>> [  5]  0.0-12.6 sec  1.25 MBytes   834 Kbits/sec
>>>>> [  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53840
>>>>> [  4]  0.0-11.9 sec  2.13 MBytes  1.49 Mbits/sec
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Aug 15, 2014 at 11:40 AM, Nick Burke <ni...@nickburke.com>
>>>>> wrote:
>>>>>
>>>>>   I upgraded from 4.0 to 4.3.0 some time ago. I didn't restart anything
>>>>>> and
>>>>>> it was all working great. However, I had to perform some maintenance
>>>>>> and
>>>>>> had to restart everything. Now, I'm seeing packet loss on all virtuals,
>>>>>> even ones on the same host.
>>>>>>
>>>>>> sudo ping -c 500  -f 172.20.1.1
>>>>>> PING 172.20.1.1 (172.20.1.1) 56(84) bytes of data.
>>>>>> ........................................
>>>>>> --- 172.20.1.1 ping statistics ---
>>>>>> 500 packets transmitted, 460 received, 8% packet loss, time 864ms
>>>>>> rtt min/avg/max/mdev = 0.069/0.218/1.290/0.139 ms, ipg/ewma
>>>>>> 1.731/0.328 ms
>>>>>>
>>>>>> No interface errors reported anywhere. The host itself isn't under
>>>>>> load at
>>>>>> all. Doesn't matter if the instance uses e1000 or virtio for the
>>>>>> drivers.
>>>>>> The only thing that I'm aware of that changed was that I had to reboot
>>>>>> all
>>>>>> the physical servers.
>>>>>>
>>>>>>
>>>>>> Could be related, but I was hit with the
>>>>>>
>>>>>> https://issues.apache.org/jira/browse/CLOUDSTACK-6464
>>>>>>
>>>>>> bug. I did follow with Marcus' suggestion:
>>>>>>
>>>>>>
>>>>>> *"This is a shot in the dark, but there have been some issues around
>>>>>>
>>>>>> upgrades that involve the cloud.vlan table expected contents changing.
>>>>>> New
>>>>>> 4.3 installs using vlan isolation don't seem to reproduce the issue.
>>>>>> I'll
>>>>>> see if I can reproduce anything like this with basic and/or non-vlan
>>>>>> isolated upgrades/installs. Can anyone experiencing an issue look at
>>>>>> their
>>>>>> database via something like "select * from cloud.vlan" and look at the
>>>>>> vlan_id. If you see something like "untagged" instead of
>>>>>> "vlan://untagged",
>>>>>> please try changing it and see if that helps."*
>>>>>>
>>>>>> --
>>>>>> Nick
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
>>>>>>
>>>>>> unafraid to destroy itself in growing into a tree.' -David Zindell, A
>>>>>> Requiem for Homo Sapiens*
>>>>>>
>>>>>>
>>>>>
>>>
>>> --
>>> Nick
>>>
>>>
>>>
>>>
>>>
>>> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
>>> unafraid to destroy itself in growing into a tree.' -David Zindell, A
>>> Requiem for Homo Sapiens*
>>>
>>
>>
>> --
>> Nick
>>
>>
>>
>>
>>
>> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
>> unafraid to destroy itself in growing into a tree.' -David Zindell, A
>> Requiem for Homo Sapiens*
>>
>
>


Re: intermittent packet loss after upgrading and restarting networks

Posted by Nick Burke <ni...@nickburke.com>.
Here are the snippets from cloudmonkey and tc:

http://pastebin.com/30Fxj3PW


On Sun, Aug 17, 2014 at 4:38 PM, Nick Burke <ni...@nickburke.com> wrote:

> Another update:
>
>
> 100% confirmed to be traffic shapping set by CloudStack. I don't know
> where/how/why, and I'd love some help with this. Should I create a new
> thread? As previously mentioned, I don't believe I've set a cap of below
> 100Mbs ANYWHERE in Cloudstack. Not in compute offerings, network offerings,
> and not in the default throttle (which is set at 200).
>
> What am I missing?
>
> I removed tc rules on the host for two test instances and bandwidth shot
> up.
>
> Before:
>
> ubuntu@testserver01:~$ iperf -s
>
> ------------------------------------------------------------
> Server listening on TCP port 5001
> TCP window size: 85.3 KByte (default)
> ------------------------------------------------------------
> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59276
> [ ID] Interval       Transfer     Bandwidth
> [  4]  0.0-10.4 sec  6.62 MBytes  5.35 Mbits/sec
> [  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59277
> [  5]  0.0-10.5 sec  6.62 MBytes  5.28 Mbits/sec
> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59278
> [  4]  0.0-10.4 sec  6.62 MBytes  5.37 Mbits/sec
> [  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59291
> [  5]  0.0-10.3 sec  6.62 MBytes  5.37 Mbits/sec
> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59306
> [  4]  0.0-10.5 sec  6.62 MBytes  5.30 Mbits/sec
>
> Removed the rules for two instances on the same host:
>
> ubuntu@dom02:~$ sudo tc qdisc del dev vnet1 root
> ubuntu@dom02:~$ sudo tc qdisc del dev vnet3 root
> ubuntu@dom02:~$ sudo tc qdisc del dev vnet3 ingress
> ubuntu@dom02:~$ sudo tc qdisc del dev vnet1 ingress
> ubuntu@dom02:~$ tc -s qdisc ls dev vnet1
> qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1
> 1 1 1 1
>  Sent 7136572 bytes 1048 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
>
> And all of a sudden, those two instances are at blazing speeds:
>
> ubuntu@testserver01:~$ iperf -s
>
> ------------------------------------------------------------
> Server listening on TCP port 5001
> TCP window size: 85.3 KByte (default)
> ------------------------------------------------------------
> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59322
> [ ID] Interval       Transfer     Bandwidth
> [  4]  0.0-10.0 sec  14.8 GBytes  12.7 Gbits/sec
> [  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59329
> [  5]  0.0-10.0 sec  19.1 GBytes  16.4 Gbits/sec
> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59330
> [  4]  0.0-10.0 sec  19.0 GBytes  16.3 Gbits/sec
>
>
>
>
>
> On Sun, Aug 17, 2014 at 12:46 PM, Nick Burke <ni...@nickburke.com> wrote:
>
>> First,
>>
>> THANK YOU FOR REPLYING!
>>
>> Second, yes, it's currently set at 200.
>>
>> The compute offering for network is either blank (or when I tested it,
>> 1000)
>> The network offering for network limit is either 100, 1000, or blank.
>>
>>
>> Those are the only network throttling parameters that I'm aware of, are
>> there any others that I missed? Is it possible disk i/o is for some reason
>> coming into play here?
>>
>> This happens regardless of if the instance network is either a virtual
>> router or is directly connected to a vlan(ie, no virtual router) when two
>> instances are directly connected to each other.
>>
>>
>>
>>
>>
>> On Sun, Aug 17, 2014 at 12:09 PM, ilya musayev <
>> ilya.mailing.lists@gmail.com> wrote:
>>
>>> Nick
>>>
>>> Have you checked network throttle settings in "global setting" and where
>>> ever else it may be defined?
>>>
>>> regads
>>> ilya
>>>
>>> On 8/17/14, 11:27 AM, Nick Burke wrote:
>>>
>>>> Update:
>>>>
>>>> After running nperf on same instances on the same virtual network, it
>>>> looks
>>>> like all instances can get no more than 2Mb/s. Additionally, it's
>>>> sporadic
>>>> and ranges from <1Mb/s, but never more than 2Mb/s:
>>>>
>>>> user@localhost:~$ iperf -c 10.1.0.1 -d
>>>> ------------------------------------------------------------
>>>> Server listening on TCP port 5001
>>>> TCP window size: 85.3 KByte (default)
>>>> ------------------------------------------------------------
>>>> ------------------------------------------------------------
>>>> Client connecting to 10.1.0.1, TCP port 5001
>>>> TCP window size: 86.8 KByte (default)
>>>> ------------------------------------------------------------
>>>> [  5] local 10.1.0.10 port 50432 connected with 10.1.0.1 port 5001
>>>> [ ID] Interval       Transfer     Bandwidth
>>>> [  5]  0.0-11.0 sec  1.25 MBytes   950 Kbits/sec
>>>> [  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53839
>>>> [  4]  0.0-11.1 sec  2.50 MBytes  1.89 Mbits/sec
>>>> user@localhost:~$ iperf -c 10.1.0.1 -d
>>>> ------------------------------------------------------------
>>>> Server listening on TCP port 5001
>>>> TCP window size: 85.3 KByte (default)
>>>> ------------------------------------------------------------
>>>> ------------------------------------------------------------
>>>> Client connecting to 10.1.0.1, TCP port 5001
>>>> TCP window size: 50.3 KByte (default)
>>>> ------------------------------------------------------------
>>>> [  5] local 10.1.0.10 port 52248 connected with 10.1.0.1 port 5001
>>>> [ ID] Interval       Transfer     Bandwidth
>>>> [  5]  0.0-12.6 sec  1.25 MBytes   834 Kbits/sec
>>>> [  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53840
>>>> [  4]  0.0-11.9 sec  2.13 MBytes  1.49 Mbits/sec
>>>>
>>>>
>>>>
>>>> On Fri, Aug 15, 2014 at 11:40 AM, Nick Burke <ni...@nickburke.com>
>>>> wrote:
>>>>
>>>>  I upgraded from 4.0 to 4.3.0 some time ago. I didn't restart anything
>>>>> and
>>>>> it was all working great. However, I had to perform some maintenance
>>>>> and
>>>>> had to restart everything. Now, I'm seeing packet loss on all virtuals,
>>>>> even ones on the same host.
>>>>>
>>>>> sudo ping -c 500  -f 172.20.1.1
>>>>> PING 172.20.1.1 (172.20.1.1) 56(84) bytes of data.
>>>>> ........................................
>>>>> --- 172.20.1.1 ping statistics ---
>>>>> 500 packets transmitted, 460 received, 8% packet loss, time 864ms
>>>>> rtt min/avg/max/mdev = 0.069/0.218/1.290/0.139 ms, ipg/ewma
>>>>> 1.731/0.328 ms
>>>>>
>>>>> No interface errors reported anywhere. The host itself isn't under
>>>>> load at
>>>>> all. Doesn't matter if the instance uses e1000 or virtio for the
>>>>> drivers.
>>>>> The only thing that I'm aware of that changed was that I had to reboot
>>>>> all
>>>>> the physical servers.
>>>>>
>>>>>
>>>>> Could be related, but I was hit with the
>>>>>
>>>>> https://issues.apache.org/jira/browse/CLOUDSTACK-6464
>>>>>
>>>>> bug. I did follow with Marcus' suggestion:
>>>>>
>>>>>
>>>>> *"This is a shot in the dark, but there have been some issues around
>>>>>
>>>>> upgrades that involve the cloud.vlan table expected contents changing.
>>>>> New
>>>>> 4.3 installs using vlan isolation don't seem to reproduce the issue.
>>>>> I'll
>>>>> see if I can reproduce anything like this with basic and/or non-vlan
>>>>> isolated upgrades/installs. Can anyone experiencing an issue look at
>>>>> their
>>>>> database via something like "select * from cloud.vlan" and look at the
>>>>> vlan_id. If you see something like "untagged" instead of
>>>>> "vlan://untagged",
>>>>> please try changing it and see if that helps."*
>>>>>
>>>>> --
>>>>> Nick
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
>>>>>
>>>>> unafraid to destroy itself in growing into a tree.' -David Zindell, A
>>>>> Requiem for Homo Sapiens*
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Nick
>>
>>
>>
>>
>>
>> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
>> unafraid to destroy itself in growing into a tree.' -David Zindell, A
>> Requiem for Homo Sapiens*
>>
>
>
>
> --
> Nick
>
>
>
>
>
> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
> unafraid to destroy itself in growing into a tree.' -David Zindell, A
> Requiem for Homo Sapiens*
>



-- 
Nick





*'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
unafraid to destroy itself in growing into a tree.' -David Zindell, A
Requiem for Homo Sapiens*

Re: intermittent packet loss after upgrading and restarting networks

Posted by ilya musayev <il...@gmail.com>.
Or perhaps this one.

http://cloudstack.apache.org/docs/api/apidocs-4.4/root_admin/restartNetwork.html
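
Roughly, with cloudmonkey (the uuid is a placeholder; cleanup=true also
rebuilds the virtual router, so expect a brief interruption):

  > list networks
  > restart network id=<network-uuid> cleanup=true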

On 8/18/14, 1:07 PM, ilya musayev wrote:
> Nick,
>
> I dont believe we throttle disks unless you have a storage that has 
> direct integration to limits iops like solidfire or possibly netapp.
>
> The change is rather simple, in the global settings level - override 
> the throttle configs. They should generally be inherited from 
> upstream, if it did not - let me know and i can try to point you to a 
> db update  you can do.
>
> Once thats done, next time you do a deployment of a vm, it will check 
> the network portgroup it has created and update it. You can also try 
> doing stop and start of the VM, it may update the portgroup configs as 
> well (not 100% certain, but i think it will work). This behavior 
> definitely applies to vmware, i'd think the same would go for other 
> hypervisors like XEN and KVM - but I dont have XEN or KVM to try this on.
>
> One other suggestion, I would ask on dev list, there is an 
> updateNetwork api call that you could make - that presumable will 
> update these settings, the description for this call is rather brief, 
> hence devs would know better.
>
> http://cloudstack.apache.org/docs/api/apidocs-4.4/root_admin/updateNetwork.html 
>
>
> Regards
> ilya
>
> On 8/17/14, 4:38 PM, Nick Burke wrote:
>> Another update:
>>
>>
>> 100% confirmed to be traffic shapping set by CloudStack. I don't know
>> where/how/why, and I'd love some help with this. Should I create a new
>> thread? As previously mentioned, I don't believe I've set a cap of below
>> 100Mbs ANYWHERE in Cloudstack. Not in compute offerings, network 
>> offerings,
>> and not in the default throttle (which is set at 200).
>>
>> What am I missing?
>>
>> I removed tc rules on the host for two test instances and bandwidth 
>> shot up.
>>
>> Before:
>>
>> ubuntu@testserver01:~$ iperf -s
>> ------------------------------------------------------------
>> Server listening on TCP port 5001
>> TCP window size: 85.3 KByte (default)
>> ------------------------------------------------------------
>> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59276
>> [ ID] Interval       Transfer     Bandwidth
>> [  4]  0.0-10.4 sec  6.62 MBytes  5.35 Mbits/sec
>> [  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59277
>> [  5]  0.0-10.5 sec  6.62 MBytes  5.28 Mbits/sec
>> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59278
>> [  4]  0.0-10.4 sec  6.62 MBytes  5.37 Mbits/sec
>> [  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59291
>> [  5]  0.0-10.3 sec  6.62 MBytes  5.37 Mbits/sec
>> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59306
>> [  4]  0.0-10.5 sec  6.62 MBytes  5.30 Mbits/sec
>>
>> Removed the rules for two instances on the same host:
>>
>> ubuntu@dom02:~$ sudo tc qdisc del dev vnet1 root
>> ubuntu@dom02:~$ sudo tc qdisc del dev vnet3 root
>> ubuntu@dom02:~$ sudo tc qdisc del dev vnet3 ingress
>> ubuntu@dom02:~$ sudo tc qdisc del dev vnet1 ingress
>> ubuntu@dom02:~$ tc -s qdisc ls dev vnet1
>> qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 
>> 1 1 1
>> 1 1 1 1
>>   Sent 7136572 bytes 1048 pkt (dropped 0, overlimits 0 requeues 0)
>>   backlog 0b 0p requeues 0
>>
>> And all of a sudden, those two instances are at blazing speeds:
>>
>> ubuntu@testserver01:~$ iperf -s
>> ------------------------------------------------------------
>> Server listening on TCP port 5001
>> TCP window size: 85.3 KByte (default)
>> ------------------------------------------------------------
>> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59322
>> [ ID] Interval       Transfer     Bandwidth
>> [  4]  0.0-10.0 sec  14.8 GBytes  12.7 Gbits/sec
>> [  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59329
>> [  5]  0.0-10.0 sec  19.1 GBytes  16.4 Gbits/sec
>> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59330
>> [  4]  0.0-10.0 sec  19.0 GBytes  16.3 Gbits/sec
>>
>>
>>
>>
>>
>> On Sun, Aug 17, 2014 at 12:46 PM, Nick Burke <ni...@nickburke.com> wrote:
>>
>>> First,
>>>
>>> THANK YOU FOR REPLYING!
>>>
>>> Second, yes, it's currently set at 200.
>>>
>>> The compute offering for network is either blank (or when I tested it,
>>> 1000)
>>> The network offering for network limit is either 100, 1000, or blank.
>>>
>>>
>>> Those are the only network throttling parameters that I'm aware of, are
>>> there any others that I missed? Is it possible disk i/o is for some 
>>> reason
>>> coming into play here?
>>>
>>> This happens regardless of if the instance network is either a virtual
>>> router or is directly connected to a vlan(ie, no virtual router) 
>>> when two
>>> instances are directly connected to each other.
>>>
>>>
>>>
>>>
>>>
>>> On Sun, Aug 17, 2014 at 12:09 PM, ilya musayev <
>>> ilya.mailing.lists@gmail.com> wrote:
>>>
>>>> Nick
>>>>
>>>> Have you checked network throttle settings in "global setting" and 
>>>> where
>>>> ever else it may be defined?
>>>>
>>>> regads
>>>> ilya
>>>>
>>>> On 8/17/14, 11:27 AM, Nick Burke wrote:
>>>>
>>>>> Update:
>>>>>
>>>>> After running nperf on same instances on the same virtual network, it
>>>>> looks
>>>>> like all instances can get no more than 2Mb/s. Additionally, it's
>>>>> sporadic
>>>>> and ranges from <1Mb/s, but never more than 2Mb/s:
>>>>>
>>>>> user@localhost:~$ iperf -c 10.1.0.1 -d
>>>>> ------------------------------------------------------------
>>>>> Server listening on TCP port 5001
>>>>> TCP window size: 85.3 KByte (default)
>>>>> ------------------------------------------------------------
>>>>> ------------------------------------------------------------
>>>>> Client connecting to 10.1.0.1, TCP port 5001
>>>>> TCP window size: 86.8 KByte (default)
>>>>> ------------------------------------------------------------
>>>>> [  5] local 10.1.0.10 port 50432 connected with 10.1.0.1 port 5001
>>>>> [ ID] Interval       Transfer     Bandwidth
>>>>> [  5]  0.0-11.0 sec  1.25 MBytes   950 Kbits/sec
>>>>> [  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53839
>>>>> [  4]  0.0-11.1 sec  2.50 MBytes  1.89 Mbits/sec
>>>>> user@localhost:~$ iperf -c 10.1.0.1 -d
>>>>> ------------------------------------------------------------
>>>>> Server listening on TCP port 5001
>>>>> TCP window size: 85.3 KByte (default)
>>>>> ------------------------------------------------------------
>>>>> ------------------------------------------------------------
>>>>> Client connecting to 10.1.0.1, TCP port 5001
>>>>> TCP window size: 50.3 KByte (default)
>>>>> ------------------------------------------------------------
>>>>> [  5] local 10.1.0.10 port 52248 connected with 10.1.0.1 port 5001
>>>>> [ ID] Interval       Transfer     Bandwidth
>>>>> [  5]  0.0-12.6 sec  1.25 MBytes   834 Kbits/sec
>>>>> [  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53840
>>>>> [  4]  0.0-11.9 sec  2.13 MBytes  1.49 Mbits/sec
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Aug 15, 2014 at 11:40 AM, Nick Burke <ni...@nickburke.com> 
>>>>> wrote:
>>>>>
>>>>>   I upgraded from 4.0 to 4.3.0 some time ago. I didn't restart 
>>>>> anything
>>>>>> and
>>>>>> it was all working great. However, I had to perform some 
>>>>>> maintenance and
>>>>>> had to restart everything. Now, I'm seeing packet loss on all 
>>>>>> virtuals,
>>>>>> even ones on the same host.
>>>>>>
>>>>>> sudo ping -c 500  -f 172.20.1.1
>>>>>> PING 172.20.1.1 (172.20.1.1) 56(84) bytes of data.
>>>>>> ........................................
>>>>>> --- 172.20.1.1 ping statistics ---
>>>>>> 500 packets transmitted, 460 received, 8% packet loss, time 864ms
>>>>>> rtt min/avg/max/mdev = 0.069/0.218/1.290/0.139 ms, ipg/ewma 
>>>>>> 1.731/0.328
>>>>>> ms
>>>>>>
>>>>>> No interface errors reported anywhere. The host itself isn't 
>>>>>> under load
>>>>>> at
>>>>>> all. Doesn't matter if the instance uses e1000 or virtio for the
>>>>>> drivers.
>>>>>> The only thing that I'm aware of that changed was that I had to 
>>>>>> reboot
>>>>>> all
>>>>>> the physical servers.
>>>>>>
>>>>>>
>>>>>> Could be related, but I was hit with the
>>>>>>
>>>>>> https://issues.apache.org/jira/browse/CLOUDSTACK-6464
>>>>>>
>>>>>> bug. I did follow with Marcus' suggestion:
>>>>>>
>>>>>>
>>>>>> *"This is a shot in the dark, but there have been some issues around
>>>>>>
>>>>>> upgrades that involve the cloud.vlan table expected contents 
>>>>>> changing.
>>>>>> New
>>>>>> 4.3 installs using vlan isolation don't seem to reproduce the issue.
>>>>>> I'll
>>>>>> see if I can reproduce anything like this with basic and/or non-vlan
>>>>>> isolated upgrades/installs. Can anyone experiencing an issue look at
>>>>>> their
>>>>>> database via something like "select * from cloud.vlan" and look 
>>>>>> at the
>>>>>> vlan_id. If you see something like "untagged" instead of
>>>>>> "vlan://untagged",
>>>>>> please try changing it and see if that helps."*
>>>>>>
>>>>>> -- 
>>>>>> Nick
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn 
>>>>>> that is
>>>>>>
>>>>>> unafraid to destroy itself in growing into a tree.' -David 
>>>>>> Zindell, A
>>>>>> Requiem for Homo Sapiens*
>>>>>>
>>>>>>
>>>>>
>>>
>>> -- 
>>> Nick
>>>
>>>
>>>
>>>
>>>
>>> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
>>> unafraid to destroy itself in growing into a tree.' -David Zindell, A
>>> Requiem for Homo Sapiens*
>>>
>>
>>
>


Re: intermittent packet loss after upgrading and restarting networks

Posted by ilya musayev <il...@gmail.com>.
Nick,

I don't believe we throttle disks unless you have storage with direct 
integration for limiting IOPS, like SolidFire or possibly NetApp.

The change is rather simple: at the global settings level, override the 
throttle configs. They should generally be inherited from upstream; if 
they are not, let me know and I can try to point you to a db update you 
can do.

Once that's done, the next time you deploy a VM it will check the 
network portgroup it has created and update it. You can also try 
stopping and starting the VM; it may update the portgroup configs as 
well (not 100% certain, but I think it will work). This behavior 
definitely applies to VMware; I'd think the same would go for other 
hypervisors like Xen and KVM, but I don't have Xen or KVM to try this on.

One other suggestion: I would ask on the dev list. There is an 
updateNetwork API call that you could make that presumably will update 
these settings; the description for this call is rather brief, hence the 
devs would know better.

http://cloudstack.apache.org/docs/api/apidocs-4.4/root_admin/updateNetwork.html
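
Something along these lines with cloudmonkey, uuids being placeholders
(the idea is to point the network at an offering that carries the rate
you actually want):

  > update network id=<network-uuid> networkofferingid=<new-offering-uuid>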

Regards
ilya

On 8/17/14, 4:38 PM, Nick Burke wrote:
> Another update:
>
>
> 100% confirmed to be traffic shapping set by CloudStack. I don't know
> where/how/why, and I'd love some help with this. Should I create a new
> thread? As previously mentioned, I don't believe I've set a cap of below
> 100Mbs ANYWHERE in Cloudstack. Not in compute offerings, network offerings,
> and not in the default throttle (which is set at 200).
>
> What am I missing?
>
> I removed tc rules on the host for two test instances and bandwidth shot up.
>
> Before:
>
> ubuntu@testserver01:~$ iperf -s
> ------------------------------------------------------------
> Server listening on TCP port 5001
> TCP window size: 85.3 KByte (default)
> ------------------------------------------------------------
> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59276
> [ ID] Interval       Transfer     Bandwidth
> [  4]  0.0-10.4 sec  6.62 MBytes  5.35 Mbits/sec
> [  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59277
> [  5]  0.0-10.5 sec  6.62 MBytes  5.28 Mbits/sec
> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59278
> [  4]  0.0-10.4 sec  6.62 MBytes  5.37 Mbits/sec
> [  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59291
> [  5]  0.0-10.3 sec  6.62 MBytes  5.37 Mbits/sec
> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59306
> [  4]  0.0-10.5 sec  6.62 MBytes  5.30 Mbits/sec
>
> Removed the rules for two instances on the same host:
>
> ubuntu@dom02:~$ sudo tc qdisc del dev vnet1 root
> ubuntu@dom02:~$ sudo tc qdisc del dev vnet3 root
> ubuntu@dom02:~$ sudo tc qdisc del dev vnet3 ingress
> ubuntu@dom02:~$ sudo tc qdisc del dev vnet1 ingress
> ubuntu@dom02:~$ tc -s qdisc ls dev vnet1
> qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1
> 1 1 1 1
>   Sent 7136572 bytes 1048 pkt (dropped 0, overlimits 0 requeues 0)
>   backlog 0b 0p requeues 0
>
> And all of a sudden, those two instances are at blazing speeds:
>
> ubuntu@testserver01:~$ iperf -s
> ------------------------------------------------------------
> Server listening on TCP port 5001
> TCP window size: 85.3 KByte (default)
> ------------------------------------------------------------
> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59322
> [ ID] Interval       Transfer     Bandwidth
> [  4]  0.0-10.0 sec  14.8 GBytes  12.7 Gbits/sec
> [  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59329
> [  5]  0.0-10.0 sec  19.1 GBytes  16.4 Gbits/sec
> [  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59330
> [  4]  0.0-10.0 sec  19.0 GBytes  16.3 Gbits/sec
>
>
>
>
>
> On Sun, Aug 17, 2014 at 12:46 PM, Nick Burke <ni...@nickburke.com> wrote:
>
>> First,
>>
>> THANK YOU FOR REPLYING!
>>
>> Second, yes, it's currently set at 200.
>>
>> The compute offering for network is either blank (or when I tested it,
>> 1000)
>> The network offering for network limit is either 100, 1000, or blank.
>>
>>
>> Those are the only network throttling parameters that I'm aware of, are
>> there any others that I missed? Is it possible disk i/o is for some reason
>> coming into play here?
>>
>> This happens regardless of if the instance network is either a virtual
>> router or is directly connected to a vlan(ie, no virtual router) when two
>> instances are directly connected to each other.
>>
>>
>>
>>
>>
>> On Sun, Aug 17, 2014 at 12:09 PM, ilya musayev <
>> ilya.mailing.lists@gmail.com> wrote:
>>
>>> Nick
>>>
>>> Have you checked network throttle settings in "global setting" and where
>>> ever else it may be defined?
>>>
>>> regads
>>> ilya
>>>
>>> On 8/17/14, 11:27 AM, Nick Burke wrote:
>>>
>>>> Update:
>>>>
>>>> After running nperf on same instances on the same virtual network, it
>>>> looks
>>>> like all instances can get no more than 2Mb/s. Additionally, it's
>>>> sporadic
>>>> and ranges from <1Mb/s, but never more than 2Mb/s:
>>>>
>>>> user@localhost:~$ iperf -c 10.1.0.1 -d
>>>> ------------------------------------------------------------
>>>> Server listening on TCP port 5001
>>>> TCP window size: 85.3 KByte (default)
>>>> ------------------------------------------------------------
>>>> ------------------------------------------------------------
>>>> Client connecting to 10.1.0.1, TCP port 5001
>>>> TCP window size: 86.8 KByte (default)
>>>> ------------------------------------------------------------
>>>> [  5] local 10.1.0.10 port 50432 connected with 10.1.0.1 port 5001
>>>> [ ID] Interval       Transfer     Bandwidth
>>>> [  5]  0.0-11.0 sec  1.25 MBytes   950 Kbits/sec
>>>> [  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53839
>>>> [  4]  0.0-11.1 sec  2.50 MBytes  1.89 Mbits/sec
>>>> user@localhost:~$ iperf -c 10.1.0.1 -d
>>>> ------------------------------------------------------------
>>>> Server listening on TCP port 5001
>>>> TCP window size: 85.3 KByte (default)
>>>> ------------------------------------------------------------
>>>> ------------------------------------------------------------
>>>> Client connecting to 10.1.0.1, TCP port 5001
>>>> TCP window size: 50.3 KByte (default)
>>>> ------------------------------------------------------------
>>>> [  5] local 10.1.0.10 port 52248 connected with 10.1.0.1 port 5001
>>>> [ ID] Interval       Transfer     Bandwidth
>>>> [  5]  0.0-12.6 sec  1.25 MBytes   834 Kbits/sec
>>>> [  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53840
>>>> [  4]  0.0-11.9 sec  2.13 MBytes  1.49 Mbits/sec
>>>>
>>>>
>>>>
>>>> On Fri, Aug 15, 2014 at 11:40 AM, Nick Burke <ni...@nickburke.com> wrote:
>>>>
>>>>   I upgraded from 4.0 to 4.3.0 some time ago. I didn't restart anything
>>>>> and
>>>>> it was all working great. However, I had to perform some maintenance and
>>>>> had to restart everything. Now, I'm seeing packet loss on all virtuals,
>>>>> even ones on the same host.
>>>>>
>>>>> sudo ping -c 500  -f 172.20.1.1
>>>>> PING 172.20.1.1 (172.20.1.1) 56(84) bytes of data.
>>>>> ........................................
>>>>> --- 172.20.1.1 ping statistics ---
>>>>> 500 packets transmitted, 460 received, 8% packet loss, time 864ms
>>>>> rtt min/avg/max/mdev = 0.069/0.218/1.290/0.139 ms, ipg/ewma 1.731/0.328
>>>>> ms
>>>>>
>>>>> No interface errors reported anywhere. The host itself isn't under load
>>>>> at
>>>>> all. Doesn't matter if the instance uses e1000 or virtio for the
>>>>> drivers.
>>>>> The only thing that I'm aware of that changed was that I had to reboot
>>>>> all
>>>>> the physical servers.
>>>>>
>>>>>
>>>>> Could be related, but I was hit with the
>>>>>
>>>>> https://issues.apache.org/jira/browse/CLOUDSTACK-6464
>>>>>
>>>>> bug. I did follow with Marcus' suggestion:
>>>>>
>>>>>
>>>>> *"This is a shot in the dark, but there have been some issues around
>>>>>
>>>>> upgrades that involve the cloud.vlan table expected contents changing.
>>>>> New
>>>>> 4.3 installs using vlan isolation don't seem to reproduce the issue.
>>>>> I'll
>>>>> see if I can reproduce anything like this with basic and/or non-vlan
>>>>> isolated upgrades/installs. Can anyone experiencing an issue look at
>>>>> their
>>>>> database via something like "select * from cloud.vlan" and look at the
>>>>> vlan_id. If you see something like "untagged" instead of
>>>>> "vlan://untagged",
>>>>> please try changing it and see if that helps."*
>>>>>
>>>>> --
>>>>> Nick
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
>>>>>
>>>>> unafraid to destroy itself in growing into a tree.' -David Zindell, A
>>>>> Requiem for Homo Sapiens*
>>>>>
>>>>>
>>>>
>>
>> --
>> Nick
>>
>>
>>
>>
>>
>> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
>> unafraid to destroy itself in growing into a tree.' -David Zindell, A
>> Requiem for Homo Sapiens*
>>
>
>


Re: intermittent packet loss after upgrading and restarting networks

Posted by Nick Burke <ni...@nickburke.com>.
Another update:


100% confirmed to be traffic shaping set by CloudStack. I don't know
where/how/why, and I'd love some help with this. Should I create a new
thread? As previously mentioned, I don't believe I've set a cap below
100 Mbps ANYWHERE in CloudStack: not in compute offerings, not in network
offerings, and not in the default throttle (which is set at 200).

What am I missing?

I removed tc rules on the host for two test instances and bandwidth shot up.

Before:

ubuntu@testserver01:~$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59276
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.4 sec  6.62 MBytes  5.35 Mbits/sec
[  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59277
[  5]  0.0-10.5 sec  6.62 MBytes  5.28 Mbits/sec
[  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59278
[  4]  0.0-10.4 sec  6.62 MBytes  5.37 Mbits/sec
[  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59291
[  5]  0.0-10.3 sec  6.62 MBytes  5.37 Mbits/sec
[  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59306
[  4]  0.0-10.5 sec  6.62 MBytes  5.30 Mbits/sec

Removed the rules for two instances on the same host:

ubuntu@dom02:~$ sudo tc qdisc del dev vnet1 root
ubuntu@dom02:~$ sudo tc qdisc del dev vnet3 root
ubuntu@dom02:~$ sudo tc qdisc del dev vnet3 ingress
ubuntu@dom02:~$ sudo tc qdisc del dev vnet1 ingress
ubuntu@dom02:~$ tc -s qdisc ls dev vnet1
qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1
1 1 1 1
 Sent 7136572 bytes 1048 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

And all of a sudden, those two instances are at blazing speeds:

ubuntu@testserver01:~$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59322
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  14.8 GBytes  12.7 Gbits/sec
[  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59329
[  5]  0.0-10.0 sec  19.1 GBytes  16.4 Gbits/sec
[  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59330
[  4]  0.0-10.0 sec  19.0 GBytes  16.3 Gbits/sec
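
For anyone who wants to see what CloudStack programmed before deleting it,
the shaping rules can be dumped first; a sketch against the same vnet1
device:

ubuntu@dom02:~$ tc -s qdisc show dev vnet1
ubuntu@dom02:~$ tc class show dev vnet1
ubuntu@dom02:~$ tc filter show dev vnet1 parent ffff:

The last command lists the ingress policer filters, if any are attached.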





On Sun, Aug 17, 2014 at 12:46 PM, Nick Burke <ni...@nickburke.com> wrote:

> First,
>
> THANK YOU FOR REPLYING!
>
> Second, yes, it's currently set at 200.
>
> The compute offering for network is either blank (or when I tested it,
> 1000)
> The network offering for network limit is either 100, 1000, or blank.
>
>
> Those are the only network throttling parameters that I'm aware of, are
> there any others that I missed? Is it possible disk i/o is for some reason
> coming into play here?
>
> This happens regardless of if the instance network is either a virtual
> router or is directly connected to a vlan(ie, no virtual router) when two
> instances are directly connected to each other.
>
>
>
>
>
> On Sun, Aug 17, 2014 at 12:09 PM, ilya musayev <
> ilya.mailing.lists@gmail.com> wrote:
>
>> Nick
>>
>> Have you checked network throttle settings in "global setting" and where
>> ever else it may be defined?
>>
>> regads
>> ilya
>>
>> On 8/17/14, 11:27 AM, Nick Burke wrote:
>>
>>> Update:
>>>
>>> After running nperf on same instances on the same virtual network, it
>>> looks
>>> like all instances can get no more than 2Mb/s. Additionally, it's
>>> sporadic
>>> and ranges from <1Mb/s, but never more than 2Mb/s:
>>>
>>> user@localhost:~$ iperf -c 10.1.0.1 -d
>>> ------------------------------------------------------------
>>> Server listening on TCP port 5001
>>> TCP window size: 85.3 KByte (default)
>>> ------------------------------------------------------------
>>> ------------------------------------------------------------
>>> Client connecting to 10.1.0.1, TCP port 5001
>>> TCP window size: 86.8 KByte (default)
>>> ------------------------------------------------------------
>>> [  5] local 10.1.0.10 port 50432 connected with 10.1.0.1 port 5001
>>> [ ID] Interval       Transfer     Bandwidth
>>> [  5]  0.0-11.0 sec  1.25 MBytes   950 Kbits/sec
>>> [  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53839
>>> [  4]  0.0-11.1 sec  2.50 MBytes  1.89 Mbits/sec
>>> user@localhost:~$ iperf -c 10.1.0.1 -d
>>> ------------------------------------------------------------
>>> Server listening on TCP port 5001
>>> TCP window size: 85.3 KByte (default)
>>> ------------------------------------------------------------
>>> ------------------------------------------------------------
>>> Client connecting to 10.1.0.1, TCP port 5001
>>> TCP window size: 50.3 KByte (default)
>>> ------------------------------------------------------------
>>> [  5] local 10.1.0.10 port 52248 connected with 10.1.0.1 port 5001
>>> [ ID] Interval       Transfer     Bandwidth
>>> [  5]  0.0-12.6 sec  1.25 MBytes   834 Kbits/sec
>>> [  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53840
>>> [  4]  0.0-11.9 sec  2.13 MBytes  1.49 Mbits/sec
>>>
>>>
>>>
>>> On Fri, Aug 15, 2014 at 11:40 AM, Nick Burke <ni...@nickburke.com> wrote:
>>>
>>>  I upgraded from 4.0 to 4.3.0 some time ago. I didn't restart anything
>>>> and
>>>> it was all working great. However, I had to perform some maintenance and
>>>> had to restart everything. Now, I'm seeing packet loss on all virtuals,
>>>> even ones on the same host.
>>>>
>>>> sudo ping -c 500  -f 172.20.1.1
>>>> PING 172.20.1.1 (172.20.1.1) 56(84) bytes of data.
>>>> ........................................
>>>> --- 172.20.1.1 ping statistics ---
>>>> 500 packets transmitted, 460 received, 8% packet loss, time 864ms
>>>> rtt min/avg/max/mdev = 0.069/0.218/1.290/0.139 ms, ipg/ewma 1.731/0.328
>>>> ms
>>>>
>>>> No interface errors reported anywhere. The host itself isn't under load
>>>> at
>>>> all. Doesn't matter if the instance uses e1000 or virtio for the
>>>> drivers.
>>>> The only thing that I'm aware of that changed was that I had to reboot
>>>> all
>>>> the physical servers.
>>>>
>>>>
>>>> Could be related, but I was hit with the
>>>>
>>>> https://issues.apache.org/jira/browse/CLOUDSTACK-6464
>>>>
>>>> bug. I did follow with Marcus' suggestion:
>>>>
>>>>
>>>> *"This is a shot in the dark, but there have been some issues around
>>>>
>>>> upgrades that involve the cloud.vlan table expected contents changing.
>>>> New
>>>> 4.3 installs using vlan isolation don't seem to reproduce the issue.
>>>> I'll
>>>> see if I can reproduce anything like this with basic and/or non-vlan
>>>> isolated upgrades/installs. Can anyone experiencing an issue look at
>>>> their
>>>> database via something like "select * from cloud.vlan" and look at the
>>>> vlan_id. If you see something like "untagged" instead of
>>>> "vlan://untagged",
>>>> please try changing it and see if that helps."*
>>>>
>>>> --
>>>> Nick
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
>>>>
>>>> unafraid to destroy itself in growing into a tree.' -David Zindell, A
>>>> Requiem for Homo Sapiens*
>>>>
>>>>
>>>
>>>
>>
>
>
> --
> Nick
>
>
>
>
>
> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
> unafraid to destroy itself in growing into a tree.' -David Zindell, A
> Requiem for Homo Sapiens*
>



-- 
Nick





*'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
unafraid to destroy itself in growing into a tree.' -David Zindell, A
Requiem for Homo Sapiens*

Re: intermittent packet loss after upgrading and restarting networks

Posted by Nick Burke <ni...@nickburke.com>.
First,

THANK YOU FOR REPLYING!

Second, yes, it's currently set at 200.

The compute offering's network rate is either blank (or, when I tested it, 1000).

The network offering's network rate limit is either 100, 1000, or blank.


Those are the only network throttling parameters that I'm aware of; are
there any others that I missed? Is it possible disk I/O is for some reason
coming into play here?

This happens regardless of whether the instance network goes through a
virtual router or is directly connected to a VLAN (i.e., no virtual router)
when two instances talk directly to each other.





On Sun, Aug 17, 2014 at 12:09 PM, ilya musayev <ilya.mailing.lists@gmail.com
> wrote:

> Nick
>
> Have you checked network throttle settings in "global setting" and where
> ever else it may be defined?
>
> regads
> ilya
>
> On 8/17/14, 11:27 AM, Nick Burke wrote:
>
>> Update:
>>
>> After running nperf on same instances on the same virtual network, it
>> looks
>> like all instances can get no more than 2Mb/s. Additionally, it's sporadic
>> and ranges from <1Mb/s, but never more than 2Mb/s:
>>
>> user@localhost:~$ iperf -c 10.1.0.1 -d
>> ------------------------------------------------------------
>> Server listening on TCP port 5001
>> TCP window size: 85.3 KByte (default)
>> ------------------------------------------------------------
>> ------------------------------------------------------------
>> Client connecting to 10.1.0.1, TCP port 5001
>> TCP window size: 86.8 KByte (default)
>> ------------------------------------------------------------
>> [  5] local 10.1.0.10 port 50432 connected with 10.1.0.1 port 5001
>> [ ID] Interval       Transfer     Bandwidth
>> [  5]  0.0-11.0 sec  1.25 MBytes   950 Kbits/sec
>> [  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53839
>> [  4]  0.0-11.1 sec  2.50 MBytes  1.89 Mbits/sec
>> user@localhost:~$ iperf -c 10.1.0.1 -d
>> ------------------------------------------------------------
>> Server listening on TCP port 5001
>> TCP window size: 85.3 KByte (default)
>> ------------------------------------------------------------
>> ------------------------------------------------------------
>> Client connecting to 10.1.0.1, TCP port 5001
>> TCP window size: 50.3 KByte (default)
>> ------------------------------------------------------------
>> [  5] local 10.1.0.10 port 52248 connected with 10.1.0.1 port 5001
>> [ ID] Interval       Transfer     Bandwidth
>> [  5]  0.0-12.6 sec  1.25 MBytes   834 Kbits/sec
>> [  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53840
>> [  4]  0.0-11.9 sec  2.13 MBytes  1.49 Mbits/sec
>>
>>
>>
>> On Fri, Aug 15, 2014 at 11:40 AM, Nick Burke <ni...@nickburke.com> wrote:
>>
>>  I upgraded from 4.0 to 4.3.0 some time ago. I didn't restart anything and
>>> it was all working great. However, I had to perform some maintenance and
>>> had to restart everything. Now, I'm seeing packet loss on all virtuals,
>>> even ones on the same host.
>>>
>>> sudo ping -c 500  -f 172.20.1.1
>>> PING 172.20.1.1 (172.20.1.1) 56(84) bytes of data.
>>> ........................................
>>> --- 172.20.1.1 ping statistics ---
>>> 500 packets transmitted, 460 received, 8% packet loss, time 864ms
>>> rtt min/avg/max/mdev = 0.069/0.218/1.290/0.139 ms, ipg/ewma 1.731/0.328
>>> ms
>>>
>>> No interface errors reported anywhere. The host itself isn't under load
>>> at
>>> all. Doesn't matter if the instance uses e1000 or virtio for the drivers.
>>> The only thing that I'm aware of that changed was that I had to reboot
>>> all
>>> the physical servers.
>>>
>>>
>>> Could be related, but I was hit with the
>>>
>>> https://issues.apache.org/jira/browse/CLOUDSTACK-6464
>>>
>>> bug. I did follow with Marcus' suggestion:
>>>
>>>
>>> *"This is a shot in the dark, but there have been some issues around
>>>
>>> upgrades that involve the cloud.vlan table expected contents changing.
>>> New
>>> 4.3 installs using vlan isolation don't seem to reproduce the issue. I'll
>>> see if I can reproduce anything like this with basic and/or non-vlan
>>> isolated upgrades/installs. Can anyone experiencing an issue look at
>>> their
>>> database via something like "select * from cloud.vlan" and look at the
>>> vlan_id. If you see something like "untagged" instead of
>>> "vlan://untagged",
>>> please try changing it and see if that helps."*
>>>
>>> --
>>> Nick
>>>
>>>
>>>
>>>
>>>
>>> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
>>>
>>> unafraid to destroy itself in growing into a tree.' -David Zindell, A
>>> Requiem for Homo Sapiens*
>>>
>>>
>>
>>
>


-- 
Nick





*'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
unafraid to destroy itself in growing into a tree.' -David Zindell, A
Requiem for Homo Sapiens*

Re: intermittent packet loss after upgrading and restarting networks

Posted by ilya musayev <il...@gmail.com>.
Nick

Have you checked the network throttle settings in "Global Settings" and
wherever else they may be defined?
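
If it helps, the relevant globals can also be pulled straight from the
database or the API; just an example, with the two setting names quoted from
memory, so double-check them against your install:

mysql -e "SELECT name, value FROM cloud.configuration WHERE name LIKE '%throttling%'"
# should match network.throttling.rate and vm.network.throttling.rate (values in Mbit/s)

or, if you have cloudmonkey set up:

list configurations name=network.throttling.rate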

regards
ilya
On 8/17/14, 11:27 AM, Nick Burke wrote:
> Update:
>
> After running iperf between instances on the same virtual network, it looks
> like no instance can get more than 2 Mbit/s. Additionally, it's sporadic
> and ranges from under 1 Mbit/s, but never more than 2 Mbit/s:
>
> user@localhost:~$ iperf -c 10.1.0.1 -d
> ------------------------------------------------------------
> Server listening on TCP port 5001
> TCP window size: 85.3 KByte (default)
> ------------------------------------------------------------
> ------------------------------------------------------------
> Client connecting to 10.1.0.1, TCP port 5001
> TCP window size: 86.8 KByte (default)
> ------------------------------------------------------------
> [  5] local 10.1.0.10 port 50432 connected with 10.1.0.1 port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  5]  0.0-11.0 sec  1.25 MBytes   950 Kbits/sec
> [  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53839
> [  4]  0.0-11.1 sec  2.50 MBytes  1.89 Mbits/sec
> user@localhost:~$ iperf -c 10.1.0.1 -d
> ------------------------------------------------------------
> Server listening on TCP port 5001
> TCP window size: 85.3 KByte (default)
> ------------------------------------------------------------
> ------------------------------------------------------------
> Client connecting to 10.1.0.1, TCP port 5001
> TCP window size: 50.3 KByte (default)
> ------------------------------------------------------------
> [  5] local 10.1.0.10 port 52248 connected with 10.1.0.1 port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  5]  0.0-12.6 sec  1.25 MBytes   834 Kbits/sec
> [  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53840
> [  4]  0.0-11.9 sec  2.13 MBytes  1.49 Mbits/sec
>
>
>
> On Fri, Aug 15, 2014 at 11:40 AM, Nick Burke <ni...@nickburke.com> wrote:
>
>> I upgraded from 4.0 to 4.3.0 some time ago. I didn't restart anything and
>> it was all working great. However, I had to perform some maintenance and
>> had to restart everything. Now, I'm seeing packet loss on all virtuals,
>> even ones on the same host.
>>
>> sudo ping -c 500  -f 172.20.1.1
>> PING 172.20.1.1 (172.20.1.1) 56(84) bytes of data.
>> ........................................
>> --- 172.20.1.1 ping statistics ---
>> 500 packets transmitted, 460 received, 8% packet loss, time 864ms
>> rtt min/avg/max/mdev = 0.069/0.218/1.290/0.139 ms, ipg/ewma 1.731/0.328 ms
>>
>> No interface errors reported anywhere. The host itself isn't under load at
>> all. Doesn't matter if the instance uses e1000 or virtio for the drivers.
>> The only thing that I'm aware of that changed was that I had to reboot all
>> the physical servers.
>>
>>
>> Could be related, but I was hit with the
>>
>> https://issues.apache.org/jira/browse/CLOUDSTACK-6464
>>
>> bug. I did follow with Marcus' suggestion:
>>
>>
>> *"This is a shot in the dark, but there have been some issues around
>> upgrades that involve the cloud.vlan table expected contents changing. New
>> 4.3 installs using vlan isolation don't seem to reproduce the issue. I'll
>> see if I can reproduce anything like this with basic and/or non-vlan
>> isolated upgrades/installs. Can anyone experiencing an issue look at their
>> database via something like "select * from cloud.vlan" and look at the
>> vlan_id. If you see something like "untagged" instead of "vlan://untagged",
>> please try changing it and see if that helps."*
>>
>> --
>> Nick
>>
>>
>>
>>
>>
>> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
>> unafraid to destroy itself in growing into a tree.' -David Zindell, A
>> Requiem for Homo Sapiens*
>>
>
>


Re: intermittent packet loss after upgrading and restarting networks

Posted by Nick Burke <ni...@nickburke.com>.
Update:

After running iperf between instances on the same virtual network, it looks
like no instance can get more than 2 Mbit/s. Additionally, it's sporadic
and ranges from under 1 Mbit/s, but never more than 2 Mbit/s:

user@localhost:~$ iperf -c 10.1.0.1 -d
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 10.1.0.1, TCP port 5001
TCP window size: 86.8 KByte (default)
------------------------------------------------------------
[  5] local 10.1.0.10 port 50432 connected with 10.1.0.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-11.0 sec  1.25 MBytes   950 Kbits/sec
[  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53839
[  4]  0.0-11.1 sec  2.50 MBytes  1.89 Mbits/sec
user@localhost:~$ iperf -c 10.1.0.1 -d
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 10.1.0.1, TCP port 5001
TCP window size: 50.3 KByte (default)
------------------------------------------------------------
[  5] local 10.1.0.10 port 52248 connected with 10.1.0.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-12.6 sec  1.25 MBytes   834 Kbits/sec
[  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53840
[  4]  0.0-11.9 sec  2.13 MBytes  1.49 Mbits/sec
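
One way to tell whether that ceiling is shaping on the host rather than
something inside the guests is to watch the tc counters on the host while
iperf runs. Just a sketch, with vnet0 standing in for whichever tap device
belongs to the test instance:

sudo watch -n1 'tc -s qdisc show dev vnet0'

If the drops/overlimits counters climb in step with the transfer, the limit
is coming from tc on the host.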



On Fri, Aug 15, 2014 at 11:40 AM, Nick Burke <ni...@nickburke.com> wrote:

>
> I upgraded from 4.0 to 4.3.0 some time ago. I didn't restart anything and
> it was all working great. However, I had to perform some maintenance and
> had to restart everything. Now, I'm seeing packet loss on all virtuals,
> even ones on the same host.
>
> sudo ping -c 500  -f 172.20.1.1
> PING 172.20.1.1 (172.20.1.1) 56(84) bytes of data.
> ........................................
> --- 172.20.1.1 ping statistics ---
> 500 packets transmitted, 460 received, 8% packet loss, time 864ms
> rtt min/avg/max/mdev = 0.069/0.218/1.290/0.139 ms, ipg/ewma 1.731/0.328 ms
>
> No interface errors reported anywhere. The host itself isn't under load at
> all. Doesn't matter if the instance uses e1000 or virtio for the drivers.
> The only thing that I'm aware of that changed was that I had to reboot all
> the physical servers.
>
>
> Could be related, but I was hit with the
>
> https://issues.apache.org/jira/browse/CLOUDSTACK-6464
>
> bug. I did follow with Marcus' suggestion:
>
>
> *"This is a shot in the dark, but there have been some issues around
> upgrades that involve the cloud.vlan table expected contents changing. New
> 4.3 installs using vlan isolation don't seem to reproduce the issue. I'll
> see if I can reproduce anything like this with basic and/or non-vlan
> isolated upgrades/installs. Can anyone experiencing an issue look at their
> database via something like "select * from cloud.vlan" and look at the
> vlan_id. If you see something like "untagged" instead of "vlan://untagged",
> please try changing it and see if that helps."*
>
> --
> Nick
>
>
>
>
>
> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
> unafraid to destroy itself in growing into a tree.' -David Zindell, A
> Requiem for Homo Sapiens*
>



-- 
Nick





*'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
unafraid to destroy itself in growing into a tree.' -David Zindell, A
Requiem for Homo Sapiens*