Posted to users@cloudstack.apache.org by Remi Bergsma <RB...@schubergphilis.com> on 2016/05/01 13:18:14 UTC

Re: regular disconnection every 2-3 mins on haproxy cloudstack mgmt server LB implementation

You're welcome! Great to hear it now works :-)

Sent from my iPhone

> On 30 Apr 2016, at 23:13, Indra Pramana <in...@sg.or.id> wrote:
> 
> Hi Remi,
> 
> Many thanks for your tips! You are right indeed: by default haproxy has
> its timeouts set to 50 seconds (50,000 ms). I changed them to 120 seconds
> (2 minutes) and the agent connection is no longer flapping. I will keep
> monitoring and let you know if there are any further issues.
> 
> In /etc/haproxy/haproxy.cfg, modify this under the defaults section:
> 
>        timeout client  120000
>        timeout server  120000
>        #timeout client  50000
>        #timeout server  50000
> 
> and restart the haproxy service for the changes to take effect.
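> 
> For anyone else hitting this, the defaults section now looks roughly like
> the sketch below. This is a minimal example only; the packaged haproxy.cfg
> may carry other options (log, retries, etc.) which I have left out, and
> the timeout connect value shown is just the common packaged default:
> 
> ====
> defaults
>         timeout connect   5000
>         # keep idle connections open longer than the ~60s agent pings
>         timeout client  120000
>         timeout server  120000
> ====
> 
> The same two timeout lines can also be placed inside the
> listen cloudstack_systemvm_8250 section instead, if you prefer to scope
> the change to the agent traffic only.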
> 
> Again, many thanks Remi, greatly appreciated. :)
> 
> -ip-
> 
> 
> On Sun, May 1, 2016 at 4:30 AM, Remi Bergsma <RB...@schubergphilis.com>
> wrote:
> 
>> Hi,
>> 
>> You may want to check whether HAproxy keeps the connection open long
>> enough. The system VMs send traffic every minute or so, and you want
>> haproxy to keep the connection open longer than that. If haproxy is set
>> to close idle connections after, say, 30 seconds, this will result in
>> flapping connections.
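>> 
>> As a rough sketch of the failure mode (assuming the ~50s client timeout
>> that the packaged haproxy.cfg ships with, and a ~60s agent ping interval):
>> 
>> ====
>> # agent pings          : one keepalive roughly every 60s
>> # timeout client 50000 : haproxy drops the idle connection at ~50s,
>> #                        just before the next ping would arrive
>> # fix                  : set timeout client/server above the ping
>> #                        interval, e.g. 120000 (120s)
>> ====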
>> 
>> Regards, Remi
>> 
>> Sent from my iPhone
>> 
>>> On 30 Apr 2016, at 22:00, Indra Pramana <in...@sg.or.id> wrote:
>>> 
>>> Dear all,
>>> 
>>> We are running CloudStack 4.2.0 with KVM hypervisor and Ceph RBD storage.
>>> We implemented a secondary CloudStack management server and an haproxy
>>> load balancer, and tonight we changed our configuration so that the
>>> CloudStack agents connect to the LB IP rather than to the CS mgmt
>>> server directly.
>>> 
>>> However, we noted that the agents get disconnected regularly, every 2-3
>>> minutes. Here are excerpts from the agent.log:
>>> 
>>> ====
>>> 2016-05-01 01:30:10,982 DEBUG [utils.nio.NioConnection]
>>> (Agent-Selector:null) Location 1: Socket
>>> Socket[addr=/X.X.X.8,port=8250,localport=50613] closed on read.
>>> Probably -1 returned: Connection closed with -1 on reading size.
>>> 2016-05-01 01:30:10,983 DEBUG [utils.nio.NioConnection]
>>> (Agent-Selector:null) Closing socket
>>> Socket[addr=/X.X.X.8,port=8250,localport=50613]
>>> 2016-05-01 01:30:10,983 DEBUG [cloud.agent.Agent] (Agent-Handler-3:null)
>>> Clearing watch list: 2
>>> 2016-05-01 01:30:15,984 INFO  [cloud.agent.Agent] (Agent-Handler-3:null)
>>> Lost connection to the server. Dealing with the remaining commands...
>>> 2016-05-01 01:30:20,985 INFO  [cloud.agent.Agent] (Agent-Handler-3:null)
>>> Reconnecting...
>>> 2016-05-01 01:30:20,986 INFO  [utils.nio.NioClient] (Agent-Selector:null)
>>> Connecting to X.X.X.8:8250
>>> 2016-05-01 01:30:21,101 INFO  [utils.nio.NioClient] (Agent-Selector:null)
>>> SSL: Handshake done
>>> 2016-05-01 01:30:21,101 INFO  [utils.nio.NioClient] (Agent-Selector:null)
>>> Connected to X.X.X.8:8250
>>> 2016-05-01 01:30:21,133 DEBUG [kvm.resource.LibvirtCapXMLParser]
>>> (Agent-Handler-1:null) Found /usr/bin/kvm as a suiteable emulator
>>> 2016-05-01 01:30:21,134 DEBUG [kvm.resource.LibvirtComputingResource]
>>> (Agent-Handler-1:null) Executing: /bin/bash -c qemu-img --help|grep
>>> convert
>>> 2016-05-01 01:30:21,152 DEBUG [kvm.resource.LibvirtComputingResource]
>>> (Agent-Handler-1:null) Execution is successful.
>>> 2016-05-01 01:30:21,152 DEBUG [kvm.resource.LibvirtComputingResource]
>>> (Agent-Handler-1:null)   convert [-c] [-p] [-q] [-n] [-f fmt] [-t cache]
>>> [-T src_cache] [-O output_fmt] [-o options] [-s snapshot_id_or_name]
>>> [-l snapshot_param] [-S sparse_size] filename [filename2 [...]]
>>> output_filename
>>>   options are: 'none', 'writeback' (default, except for convert),
>>>   'writethrough', 'directsync' and 'unsafe' (default for convert)
>>> ====
>>> 
>>> and then it reconnects, then gets disconnected again, and so on. It
>>> continues in that loop.
>>> 
>>> haproxy.cfg configuration for the NIC facing the hypervisors:
>>> 
>>> ====
>>> listen cloudstack_systemvm_8250
>>> bind X.X.X.8:8250
>>> mode tcp
>>> option tcplog
>>> balance source
>>> server management-server-01.xxx.com X.X.X.3:8250 maxconn 32 check
>>> server management-server-02.xxx.com X.X.X.6:8250 maxconn 32 check
>>> ====
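>>> 
>>> For clarity, my understanding of the key directives above:
>>> 
>>> ====
>>> # balance source : hash on the client source IP, so a given agent
>>> #                  always lands on the same management server
>>> # maxconn 32     : cap on concurrent connections per backend server
>>> # check          : enable periodic TCP health checks on each server
>>> ====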
>>> 
>>> Note that .3 and .6 are the first and second CloudStack management
>>> servers respectively, while .8 is the IP of the load balancer. We are
>>> using one LB at the moment.
>>> 
>>> Nothing much is found in haproxy.log; see below.
>>> 
>>> ====
>>> May  1 01:14:41 cs-haproxy-02 haproxy[923]: X.X.X.28:50401
>>> [01/May/2016:01:12:50.803] cloudstack_systemvm_8250
>>> cloudstack_systemvm_8250/management-server-01.xxx.com 1/0/110340 8584 cD
>>> 0/0/0/0/0 0/0
>>> May  1 01:15:46 cs-haproxy-02 haproxy[923]: X.X.X.28:50402
>>> [01/May/2016:01:14:51.150] cloudstack_systemvm_8250
>>> cloudstack_systemvm_8250/management-server-01.xxx.com 1/0/54920 8234 cD
>>> 0/0/0/0/0 0/0
>>> May  1 01:16:47 cs-haproxy-02 haproxy[923]: X.X.X.28:50403
>>> [01/May/2016:01:15:56.075] cloudstack_systemvm_8250
>>> cloudstack_systemvm_8250/management-server-01.xxx.com 1/0/51344 7868 cD
>>> 0/0/0/0/0 0/0
>>> May  1 01:17:48 cs-haproxy-02 haproxy[923]: X.X.X.28:50404
>>> [01/May/2016:01:16:57.426] cloudstack_systemvm_8250
>>> cloudstack_systemvm_8250/management-server-01.xxx.com 1/0/50854 7630 cD
>>> 0/0/0/0/0 0/0
>>> May  1 01:18:49 cs-haproxy-02 haproxy[923]: X.X.X.28:50405
>>> [01/May/2016:01:17:58.285] cloudstack_systemvm_8250
>>> cloudstack_systemvm_8250/management-server-01.xxx.com 1/0/50955 7630 cD
>>> 0/0/0/0/0 0/0
>>> May  1 01:24:49 cs-haproxy-02 haproxy[923]: X.X.X.28:50406
>>> [01/May/2016:01:18:59.245] cloudstack_systemvm_8250
>>> cloudstack_systemvm_8250/management-server-01.xxx.com 1/0/350361 14638 cD
>>> 0/0/0/0/0 0/0
>>> May  1 01:28:00 cs-haproxy-02 haproxy[923]: X.X.X.28:50571
>>> [01/May/2016:01:27:09.638] cloudstack_systemvm_8250
>>> cloudstack_systemvm_8250/management-server-01.xxx.com 1/0/50602 2852 cD
>>> 0/0/0/0/0 0/0
>>> May  1 01:30:11 cs-haproxy-02 haproxy[923]: X.X.X.28:50613
>>> [01/May/2016:01:29:20.260] cloudstack_systemvm_8250
>>> cloudstack_systemvm_8250/management-server-01.xxx.com 1/0/50876 7630 cD
>>> 0/0/0/0/0 0/0
>>> May  1 01:32:11 cs-haproxy-02 haproxy[923]: X.X.X.28:50614
>>> [01/May/2016:01:30:21.142] cloudstack_systemvm_8250
>>> cloudstack_systemvm_8250/management-server-01.xxx.com 1/0/110308 8870 cD
>>> 0/0/0/0/0 0/0
>>> ====
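>>> 
>>> If I am reading the tcplog format correctly, the trailing fields on
>>> each line break down roughly as follows (termination state as described
>>> in the haproxy docs):
>>> 
>>> ====
>>> # 1/0/50876 : Tw/Tc/Tt timers -- total session time Tt is ~50s on
>>> #             most of the lines above
>>> # 7630      : bytes read
>>> # cD        : c = client-side timeout expired,
>>> #             D = session was in the DATA phase when it closed
>>> ====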
>>> 
>>> Note that .28 is the IP address of the hypervisor I used to test
>>> connecting to the LB IP. Did I miss anything in the haproxy
>>> configuration? Any advice is greatly appreciated.
>>> 
>>> Looking forward to your reply, thank you.
>>> 
>>> Cheers.
>>